ABSTRACT: A pair of complementary oligodeoxynucleotides (ODNs) uniformly substituted with 2-aminoadenine
(A') in place of adenine and 2-thiothymine (T') in place of thymine did not hybridize to each
other but did form very stable hybrids with unmodified complementary ODNs. These unusual properties
were a consequence of the hydrogen-bonding properties of the two base analogs. Thermal denaturation
studies of short duplexes which contained these bases demonstrated that the A'-T and A-T' doublets
formed stable base pairs whereas the A'-T' doublet acted like a mismatch. Complementary ODNs
substituted with these base analogs are referred to as SBC or selectively binding complementary ODNs.
When used as a pair, these single-stranded ODNs invaded the ends of homologous duplexes and formed 
stable three-arm junctions under conditions where unmodified ODNs failed to give a product. SBC ODNs 
have a fundamental thermodynamic advantage in hybridizing to short segments of double-stranded nucleic 
acid and represent a new approach for the design of oligomeric probes and antisense agents. Many 
secondary structure features present in long single-stranded nucleic acids should be accessible to these 
reagents. 

The use of oligodeoxynucleotides (ODNs)¹ as diagnostic
probes and antisense agents is based upon the Watson-Crick 
base pairing of complementary nucleic acid sequences. 
These same hydrogen-bonding interactions also occur in long 
single-stranded DNA or RNA molecules, where they result 
in the formation of short, usually imperfect duplexes which 
can interfere with the hybridization of ODNs (Gamper et 
al., 1987; Chastain & Tinoco, 1993). Although numerous 
approaches have been described for the design of ODNs 
which overcome or exploit the natural tendency of DNA or 
RNA to base pair with itself, none provides a general solution 
to the problem. 

Modified ODNs which form very stable hybrids have 
frequently been used to improve the efficiency of hybridiza- 
tion to single-stranded nucleic acid. Examples include ODNs 
with 2'-modified (Monia et al., 1993; Sproat & Lamond,
1993), N3'→P5' phosphoramidate (Gryaznov & Chen,
1994), and peptide (Nielsen et al., 1994) backbones, ODNs
containing base analogs such as 2-aminoadenine (Lamm et
al., 1991) or C5 propynylpyrimidines (Wagner et al., 1993),
or ODNs conjugated to an intercalating agent (Asseline et 
al., 1984) or a minor groove binding agent (Afonina et al., 
1996). However, none of these modifications guarantees 
efficient hybridization if the targeted sequence is already 
substantially base paired to another sequence in the same 
molecule. 

Hybridization strategies which rationally exploit specific 
sequence or structural elements within a single-stranded 
target have also been described. Circular ODNs capable of 

* Corresponding author. Telephone: 206-485-8566. Fax: 206-486-
8336. E-mail: howardg@epochpharm.com.

Abstract published in Advance ACS Abstracts, August 1, 1996.

¹ Abbreviations: A', 2-aminoadenine; CPG, controlled pore glass;
HPLC, high-performance liquid chromatography; ODN, oligodeoxynucleotide;
SBC, selectively binding complementary; TAJ, three-arm
junction; Tm, melting temperature; T', 2-thiothymine.



triple-strand formation with a homopurine or homopyrimidine
run exhibit enhanced binding affinities relative to [...]
ODNs which only form a duplex (Prakash & Kool, [...];
Wang & Kool, 1994). Tethered ODNs complementary to
two single-stranded sequences in close proximity to one
another (Richardson & Schepartz, 1991) or separate ODNs
which bind to contiguous sequences (Kutyavin et al., [...])
can exhibit cooperative binding. Localized hairpins are
frequently found in long single-stranded nucleic acids and
can be rationally targeted. Formation of a pseudoknot by
hybridization of an ODN to a hairpin loop can significantly
enhance hybrid stability (Ecker et al., 1992; Lima et al.,
1992). Alternatively, an ODN can hybridize to the two
single-stranded arms of a hairpin (Francois et al., 1994). In
the special case where the hairpin stem contains a homopurine
run, a single ODN can bind to both the stem and one
of the flanking single-stranded arms (Brossalina & Toulme,
1993; Francois & Helene, 1995).

We describe the hybridization and strand invasion properties
of a new class of modified ODNs designed to be used
as paired complements. Substituted with 2-thiothymine and
2-aminoadenine, these selectively binding complementary
(SBC) ODNs form very stable hybrids with complementary
unmodified sequences, yet they do not interact with each
other. Hybridization of a complementary pair of single-
stranded SBC ODNs to both strands of a short homologous
DNA or RNA duplex should be kinetically favored since
the pathway does not involve the formation of an intermediate
Holliday junction and thermodynamically favored due
to an increase in the number of base pairs (see Figure 1).
As a consequence, we show that two SBC ODNs (but not
the corresponding unmodified ODNs) strand invade both
blunt-ended and recessed duplexes to form stable three-arm
junctions. These model experiments suggest that the hybridization
of paired SBC ODNs to long single-stranded

Figure 1: (A) Unmodified three-arm junction undergoes branch
migration to yield two duplexes. During branch migration, crossover
of strands occurs within a Holliday junction. In the presence of
MgCl2, the step time for branch migration is relatively long
[hundreds of milliseconds (Panyutin et al., 1995)] because the four
double-stranded arms of the junction stack to form two colinear
helices (Lilley & Clegg, 1993). Branch migration in either direction
preserves total base pairing. (B) Two SBC ODNs strand invade a
longer homologous duplex to form a stable three-arm junction. Each
SBC ODN is complementary to one strand of the duplex. Since
the crossover junction contains three double-stranded and two
single-stranded arms, its movement should be more facile than that
of the highly structured Holliday junction. Strand invasion is driven
by the formation of new base pairs. The total number of base pairs
is indicated for the starting and ending states of the DNA, assuming
each junction has three 20 bp long arms.

DNA or RNA should be less inhibited by the presence of 
secondary structure than a regular ODN. 

MATERIALS AND METHODS 

5'-O-(Dimethoxytrityl)-2-thiothymidine-3'-O-(2-cyanoethyl)-
N,N-diisopropylphosphoramidite was prepared using
the procedure of Connolly and Newman (1989). 2,6-
Diaminopurine 2'-deoxyriboside was synthesized as described
by Fathi et al. (1990). 5'-O-(Dimethoxytrityl)-N2,N6-
bis(phenoxyacetyl)-2,6-diaminopurine deoxyriboside-3'-O-
(2-cyanoethyl)-N,N-diisopropylphosphoramidite was prepared
essentially as described for the 2'-O-allyl analog (Sproat et
al., 1991). A polystyrene support for DNA synthesis (Primer
Support, Pharmacia) was modified with hexanol according
to the procedure described for hexanol CPG (Gamper et al.,
1993).

Oligonucleotide Synthesis. SBC ODNs were synthesized
on a Pharmacia OligoPilot DNA synthesizer in 10 µmol scale
using hexanol Primer Support. Deprotection and detachment
from the solid support was accomplished with concentrated
ammonia at 40 °C for 15 h. The remaining second
phenoxyacetyl group on 2,6-diaminopurine residues was
removed with a mixture of hydrazine/ethanolamine/methanol
(1:5:5, v/v/v) (Polushin & Cohen, 1994). Purification of
trityl-on ODNs was performed on a Hamilton PRP-1 (7.0 ×
305 mm) reverse phase column employing a gradient of 5
to 40% acetonitrile in 0.1 M NaClO4 (pH 7). After
detritylation with 80% acetic acid for 15 min at room
temperature, the ODNs were precipitated by addition of a
2% solution of NaClO4 in acetone.

Enzymatic Digestion. To determine the stability of the
modified bases to the deprotection conditions and evaluate
the purity of synthesized ODNs, about 20 µg of each ODN
was digested to nucleosides by a mixture of phosphodiesterase
I, DNase I, and alkaline phosphatase. The hydrolysates
were analyzed by reverse phase HPLC using a C18 (2
× 150 mm) column and a Waters 994 photodiode array
detector. Impurities derived from modification of the
2-thiothymine and 2,6-diaminopurine bases accounted for
less than 5% of the detected material.

Thermal Denaturation Studies. Hybrids formed between
complementary ODNs were melted at a rate of 0.5 °C/min
in 200 mM NaCl, 0.1 mM EDTA, and 10 mM Na2HPO4
(pH 7.0) in a Lambda 2 (Perkin-Elmer) spectrophotometer
equipped with a PTP-6 automatic multicell temperature
programmer. Each ODN was mixed with an equimolar
amount of complement to give a total strand concentration
of 8 × 10⁻⁷ M. Prior to melting, samples were denatured
at 100 °C and then cooled to the starting temperature over
a 10 min period. The melting temperatures (Tm values) of
the hybrids were determined from the derivative maxima.
Free energies were calculated according to a "two-state"
model by minimization of mean square errors between the
calculated and experimental melting curves (Petersheim &
Turner, 1983).
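
The relationship between a two-state melting curve and a Tm read from the derivative maximum can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the fitting procedure of Petersheim and Turner: the enthalpy and entropy values are hypothetical, chosen to give a transition near 60 °C, and the function names are invented for this sketch. The only values taken from the text are the equimolar, non-self-complementary strands and the total strand concentration of 8 × 10⁻⁷ M.

```python
import math

R = 1.987e-3  # gas constant, kcal/(mol*K)


def fraction_duplex(temp_c, dh, ds, ct):
    """Fraction of strands in duplex for A + B <-> AB (two-state model).

    dh (kcal/mol) and ds (kcal/(mol*K)) describe duplex formation;
    ct is the total strand concentration (M) with equimolar strands.
    """
    t = temp_c + 273.15
    k = math.exp(-(dh - t * ds) / (R * t))  # association constant
    x = k * ct
    # Root of x*(1 - a)^2 = 2*a, written in a numerically stable form.
    return x / (x + 1.0 + math.sqrt(2.0 * x + 1.0))


def tm_from_derivative(temps, alphas):
    """Temperature where |d(alpha)/dT| is largest (central differences)."""
    best_t, best_slope = None, 0.0
    for i in range(1, len(temps) - 1):
        slope = abs((alphas[i + 1] - alphas[i - 1]) / (temps[i + 1] - temps[i - 1]))
        if slope > best_slope:
            best_t, best_slope = temps[i], slope
    return best_t


# Hypothetical thermodynamic parameters (assumption, not from the paper).
DH, DS = -150.0, -0.42
CT = 8e-7  # total strand concentration from the Methods section

temps = [20.0 + 0.1 * i for i in range(701)]  # 20 to 90 C
alphas = [fraction_duplex(t, DH, DS, CT) for t in temps]
tm_derivative = tm_from_derivative(temps, alphas)

# Analytical midpoint (alpha = 0.5), where K = 4 / Ct.
tm_midpoint = DH / (DS + R * math.log(CT / 4.0)) - 273.15
```

For a bimolecular transition the derivative maximum and the half-dissociation midpoint are close but not identical, which is one reason a least-squares fit of the whole curve is the more careful way to extract free energies.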

Gel Mobility Shift Assays. Sequential hybridization and
strand invasion experiments were conducted at room temperature
(25-27 °C) in 200 mM NaCl, 0.1 mM EDTA, and
10 mM Na2HPO4 (pH 7.0) as described in the figure legends.
The concentrations of invading 20-mers and target duplex
strands were 2.5 × 10⁻⁶ and 5 × 10⁻⁷ M, respectively. Prior
to use, ODN 3 was 5' end-labeled with 32P using polynucleotide
kinase and [γ-32P]ATP. The final reaction volumes were
20-200 µL. Aliquots (5 µL) were removed at specific times
into 5 µL volumes of cold dyes, ficoll and 2 mM MgCl2,
quickly frozen in a dry ice bath, and stored at -20 °C until
they were analyzed. Immediately prior to loading onto a
gel, each aliquot was thawed in an ice bath. Electrophoresis
was conducted in a precooled 8% nondenaturing polyacrylamide
gel (0.04 × 20 × 40 cm) containing 89 mM Tris-
borate (pH 8.3), 2 mM EDTA, and 3 mM MgCl2 for 4 h at
10 W. The dried gel was visualized by autoradiography,
and bands were quantified using a BioRad GS-250 phosphorimager.
Control studies showed that storage of aliquots
prior to electrophoretic analysis did not alter the distribution
of products.

RESULTS 

Design of SBC ODNs. Selectively binding complementary 
ODNs are defined as self-complementary ODNs or pairs of 
complementary ODNs which do not interact with each other 
under physiological conditions and so exist as single-stranded 
molecules. While unable to base pair to each other, every 
SBC ODN can by design hybridize to a complementary DNA 
or RNA sequence with no mismatching. The hybrids so 
formed exhibit stabilities comparable to or greater than those 
of regular hybrids. These unique properties should promote 
strand invasion of paired SBC ODNs into a segment of 
double-stranded DNA or RNA. The kinetics of the reaction 
should be accelerated due to the reduced likelihood of 
forming transitory Holliday junctions, while the free energy 
of the complex should be reduced by the creation of 
additional base pairs. Figure 1 shows the stable three-arm 
junction formed when using SBC ODNs to strand invade a 
duplex. Also shown is the spontaneous decay of the
corresponding unmodified three-arm junction. These reactions
will be described more fully once the experimental
results have been presented.

ODN   Sequence

1a    5'-GTAAGAGAATTATGCAGTGC
1b    5'-GT'A'A'GA'GA'A'T'T'A'T'GCA'GT'GC
1c    5'-GTA'A'GA'GA'A'TTA'TGCA'GTGC

2a    5'-GCACTGCATAATTCTCTTAC
2b    5'-GCA'CT'GCA'T'A'A'T'T'CT'CT'T'A'C
2c    5'-GCA'CTGCA'TA'A'TTCTCTTA'C
2d    5'-GCACT'GCAT'AAT'T'CT'CT'T'AC

Figure 3: Oligonucleotides used in this study. 1a and 2a are normal
(unmodified) ODNs, while 1b and 2b are SBC ODNs [...]
3-4 and 3-5 [...] homologous to ODNs 1a [...]

To accomplish the design goals, SBC ODNs were synthesized
by substituting 2-aminoadenine (A') for adenine (A)
and 2-thiothymine (T') for thymine (T). Figure 2 shows the
hydrogen-bonded pairs formed by these bases. The A'-T
base pair possesses an extra hydrogen bond relative to A-T
(Howard & Miles, 1984) and is frequently used to stabilize
nucleic acid hybrids (Azhikina et al., 1993; Sproat &
[...]). The A-T' base pair has the same
hydrogen-bonding pattern as A-T, and its introduction into
[...]
(Connolly & Newman, 1989; Newman et al., 1990; Kuimelis
& Nambiar, 1994). By contrast, the A'-T' doublet should
[...] mismatch. Model building indicates that
[...] of [...] and the 2-amino
group of adenine tilts the bases relative to each other, thereby
allowing only one hydrogen bond to form.
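
The substitution rule and its consequence for pairing can be written out as a small Python sketch. The pairing classes below are taken from the text (A'-T and A-T' pair stably, A'-T' behaves like a mismatch); the function names, the string encoding with a trailing prime, and the 20-mer sequences as read from Figure 3 are assumptions of this illustration.

```python
# Pairing classes taken from the text; G-C is unchanged by the substitution.
STABLE = {("A", "T"), ("A'", "T"), ("A", "T'"), ("C", "G")}
MISMATCH_LIKE = {("A'", "T'")}


def tokenize(seq):
    """Split a 5'->3' string such as "GT'A'C" into base tokens."""
    tokens = []
    for ch in seq:
        if ch == "'":
            tokens[-1] += "'"
        else:
            tokens.append(ch)
    return tokens


def to_sbc(seq):
    """Replace every A with A' and every T with T'."""
    return "".join(b + "'" if b in ("A", "T") else b for b in tokenize(seq))


def classify(base1, base2):
    """Classify one base juxtaposition as stable, mismatch-like, or other."""
    pair = tuple(sorted((base1, base2)))
    if pair in STABLE:
        return "stable"
    if pair in MISMATCH_LIKE:
        return "mismatch-like"
    return "other"


def doublets(strand1, strand2):
    """Pair strand1 (5'->3') with strand2 read 3'->5' (antiparallel)."""
    t1, t2 = tokenize(strand1), tokenize(strand2)[::-1]
    if len(t1) != len(t2):
        raise ValueError("strands must have equal length")
    return [classify(a, b) for a, b in zip(t1, t2)]


# 20-mer sequences as read from Figure 3 (1a and its complement 2a).
ODN_1A = "GTAAGAGAATTATGCAGTGC"
ODN_2A = "GCACTGCATAATTCTCTTAC"
ODN_1B, ODN_2B = to_sbc(ODN_1A), to_sbc(ODN_2A)
```

With these sequences, every A/T position of the SBC-SBC pairing is a mismatch-like A'-T' doublet, while each SBC strand remains fully paired with its unmodified complement.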

Hybridization Properties of ODNs. On the basis of
the hydrogen-bonding properties of 2-aminoadenine and
2-thiothymine, we expected complementary ODNs which
contained these bases to exhibit SBC properties. For the
[...] complementary [...]
with regular (unmodified)
[...] thymine bases to generate the eight ODNs listed
in Figure 3. Each of the 16 possible hybrids was formed in
10 mM Na2HPO4 (pH 7.0) buffer containing 200 mM NaCl
and 0.1 mM EDTA and assayed by ultraviolet monitored
thermal denaturation.

Figure 4: First-derivative melting curves of representative hybrids.

Representative first-derivative melting curves are shown
in Figure 4. The DNA-DNA hybrid (1a-2a) had a Tm of
[...] °C, whereas the two alternative SBC-DNA hybrids (1a-
2b and 1b-2a) had Tm values approximately 10 °C higher.
The relatively sharp, symmetric peaks obtained for the SBC-
DNA hybrids indicated that A'-T and A-T' base pairs did
not alter the cooperativity of hybridization. The SBC-SBC
[...]
that of the DNA-DNA hybrid and 40 °C lower than that
of the two SBC-DNA hybrids. Above approximately 35
°C the two complementary SBC 20-mers coexisted as single-
stranded molecules capable of forming stable fully base-
paired hybrids with normal complements. By these criteria
[...] met the definition of selectively binding complementary
ODNs. The destabilizing effect of an A'-T' doublet relative
to an authentic mismatch was estimated in separate experiments
which showed that the Tm of the control duplex (1a-
2a) [...] depressed 2.4 °C for each introduction
of an A'-T' doublet versus 6.3 °C for each introduction of a
[...] mismatch.

The free energy (ΔG°) of each hybrid was determined
using the two-state approximation for melting (Petersheim
& Turner, 1983). These values are summarized together with
the Tm results in Table 1. At 37 °C the two SBC-DNA
hybrids (1a-2b and 1b-2a) were 1 kcal/mol more stable
than the DNA-DNA hybrid (1a-2a) and 11 kcal/mol more
stable than the SBC-SBC hybrid (1b-2b). On the basis
of [...] the equilibrium binding constant for the
SBC-DNA duplexes was nearly 8 orders of magnitude
greater than that for the SBC-SBC duplex. The free energy
values for the other hybrids in Table 1 were as anticipated
and further confirmed that the A'-T, A-T', and A'-T' base
juxtapositions, respectively, stabilized, had little effect on,
or destabilized the duplex under study.
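
The step from an 11 kcal/mol free energy difference to "nearly 8 orders of magnitude" in binding constant follows from K = exp(-ΔG°/RT). A minimal Python check, assuming T = 310.15 K (37 °C) and R = 1.987 cal/(mol·K); the function name is illustrative only:

```python
import math

R = 1.987e-3   # gas constant, kcal/(mol*K)
T = 310.15     # 37 C in kelvin


def binding_constant_ratio(ddg_kcal):
    """Ratio of equilibrium constants for a free energy difference (kcal/mol)."""
    return math.exp(ddg_kcal / (R * T))


# SBC-DNA versus SBC-SBC hybrid: 11 kcal/mol difference quoted in the text.
ratio_sbc = binding_constant_ratio(11.0)
orders_of_magnitude = math.log10(ratio_sbc)

# SBC-DNA versus DNA-DNA hybrid: 1 kcal/mol difference quoted in the text.
ratio_dna = binding_constant_ratio(1.0)
```

Under these assumptions the 11 kcal/mol gap corresponds to a ratio between 10^7 and 10^8, while the 1 kcal/mol advantage over the unmodified duplex corresponds to roughly a fivefold larger binding constant.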

Formation and Stability of Mobile Junctions. The hybridization
properties of SBC ODNs 1b and 2b should favor
the formation of stable, mobile three-way junctions between
these ODNs and longer unmodified duplexes which contain
the same sequences at one end. Two 40-mer hybrids (see
Figure 3) were prepared to test this assumption, one with a
blunt end (hybrid 3-4) and the other with a five-base long
recessed end (hybrid 3-5). Three-arm junctions were
formed two different ways. In the first protocol, each
member strand of the longer duplex was separately hybridized
to the complementary "invading" 20-mer, after which
both partial hybrids were combined to form a three-arm
junction. Sequential hybridization permitted the formation
of otherwise unstable junctions involving unmodified
ODNs. In the second protocol, the longer duplex was
preformed and then incubated at room temperature with the
two invading 20-mers. The high Tm values of hybrids 3-4
and 3-5 (respectively, 69 and 68 °C) were expected to
ensure that any three-arm junction would have been generated
by strand invasion. Both SBC (1b and 2b) and
unmodified (1a and 2a) ODNs were tested for branched
molecule formation.

Table 1: Melting Transition Temperatures and Free Energies for
Hybrids Substituted with 2-Aminoadenine and/or 2-Thiothymine
Base Analogs (a)

Row entries that survive (hybrid, doublets (b), Tm in °C (c)):
  [...]-2a                   57
  1d-2b    A-T', A'-T'       47
  [...]    A-T, A'-T'        4[...]
  1d-2d                      61

-ΔG° (kcal/mol) (c), two columns of 16 values in row order:
  first column:  16.2, 17.3, 17.2, 6.2, 17.4, 16.6, 13.2, 10.7,
                 18.8, 10.8, 20.3, 8.6, 13.6, 11.3, 10.1, 14.1
  second column: 8.1, 11.2, 10.7, 0.3, 9.3, 9.4, 7.0, 4.0,
                 10.0, 4.2, 12.1, 2.3, 9.5, 6.9, 5.4, 10.0

(a) Determined in 200 mM NaCl, 0.1 mM EDTA, and 10 mM
Na2HPO4 (pH 7.0) with a 16 µM total nucleotide concentration.
(b) The presence of A-T, A'-T, A-T', and A'-T' doublets in each
hybrid is indicated. (c) Each reported value for Tm and free energy
is an average of at least three separate experiments; uncertainties
in Tm values and free energies are estimated at ±1.0 °C and ±15%,
respectively.

A gel mobility shift analysis was used to detect three-arm
junction formation with the blunt-ended hybrid (Figure 5).
Sequential hybridization using the SBC 20-mers generated
a good yield of the three-arm junction (lane 5), whereas the
same protocol using normal 20-mers generated much less
of the desired junction (lane 3). Substitution of 2-aminoadenine
for adenine in the unmodified ODNs to promote the
formation of more stable branched molecules did not improve
the yield (lane 4). Sequential hybridization using only one
SBC or normal 20-mer failed to generate any branched
product (lanes 7 and 8). The joint molecule formed in these
reactions appears to have undergone spontaneous branch
migration to yield a single-stranded 20-mer and a double-
stranded 40-mer (Radding et al., 1977).

The stability of the three-arm junction depended upon
whether normal or SBC 20-mers were used in the sequential
hybridization protocol. When the junction contained unmodified
ODNs, branch migration required many hours to
resolve the four component strands into separate 40-mer and
20-mer duplexes (data not shown). The smear in lanes 3
and 4 of Figure 5 is attributable to this reaction occurring
during the course of electrophoresis. By contrast, the three-
arm junction formed using SBC ODNs was very stable and
did not undergo branch migration (lanes 5 and 6 in Figure
5). This was expected since resolution of the junction would

Figure 5: Gel shift analysis of branched molecules formed by the
interaction of normal or SBC 20-mers with the blunt-ended duplex
3-4. Reactions were carried out at room temperature (25-27 °C)
in the standard buffer. ODN 3 was end-labeled and present at a
final concentration of 5 × 10⁻⁷ M. When used in forming a
branched molecule in this or other experiments, the molar ratio of
ODNs 3, 4, 1a-c, and 2a-c was 1:1:5:5. Intermediate hybrids used
in the sequential hybridization protocols were formed by incubating
the respective strands for at least 10 min. Controls: lane 1, ODN
3; lane 2, duplex 3-4. Sequential hybridization of ODNs to form
a three-arm junction followed by a 15 min incubation: lane 3,
duplex 3-2a added to duplex 4-1a; lane 4, duplex 3-2c added
to duplex 4-1c; lane 5, duplex 3-2b added to duplex 4-1b.
Sequential hybridization of ODNs to form a three-arm junction
followed by a 12 h incubation: lane 6, duplex 3-2b added to
duplex 4-1b. Sequential hybridization to form a D-loop-like
molecule: lane 7, ODN 4 added to duplex 3-2a; lane 8, ODN 4
added to duplex 3-2b. Strand invasion of duplex 3-4 to form a
three-arm junction (12 h incubation): lane 9, ODNs 1a and 2a;
lane 10, ODNs 1b and 2b. At the completion of each reaction an
aliquot was quickly frozen in dry ice and stored at -20 °C until it
was analyzed by electrophoresis in a precooled (10 °C) 8%
nondenaturing polyacrylamide gel run at room temperature. The
closely spaced doublets in lanes 3 and 4 probably represent different
conformers of the same three-arm junction; such doublets have been
reported previously (Zhong et al., 1994).



have been accompanied by a reduction in the total number 
of base pairs. 
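
The base-pair bookkeeping behind this argument can be made explicit with a short Python sketch. It assumes the idealized geometry of Figure 1 (a 40 bp target duplex and two 20-mer invaders giving three 20 bp arms) and treats complementary SBC ODNs as contributing no base pairs to each other; the function and parameter names are invented for the illustration.

```python
def total_base_pairs(steps, invaders_pair_with_each_other, arm=20):
    """Total base pairs after `steps` positions of strand exchange.

    The target is a duplex of 2*arm bp; each invader is `arm` nt long.
    Every step opens one target base pair and lets both invaders pair
    with the two freed target bases.
    """
    if not 0 <= steps <= arm:
        raise ValueError("steps must lie between 0 and the arm length")
    target_pairs = 2 * arm - steps
    # Unmodified invaders start as a duplex and lose one pair per step;
    # SBC invaders are single-stranded and have nothing to lose.
    invader_duplex_pairs = (arm - steps) if invaders_pair_with_each_other else 0
    new_pairs = 2 * steps
    return target_pairs + invader_duplex_pairs + new_pairs


# Fully formed three-arm junction versus the resolved state.
sbc_resolved = total_base_pairs(0, invaders_pair_with_each_other=False)
sbc_junction = total_base_pairs(20, invaders_pair_with_each_other=False)
normal_resolved = total_base_pairs(0, invaders_pair_with_each_other=True)
normal_junction = total_base_pairs(20, invaders_pair_with_each_other=True)
```

In this simplified count the SBC junction holds 60 bp against 40 bp for the resolved state, so resolution costs base pairs, whereas the unmodified system holds 60 bp in either state and gains nothing from staying branched.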

A thermostability experiment showed that 50% of the 
three-arm junction formed between the SBC 20-mers and 
the blunt-ended 40-mer hybrid had resolved into the com- 
ponent duplex and single strands at 62 °C (Figure 6). This 
apparent T m was only somewhat lower than the T m of the 
blunt-ended 40-mer duplex in the same buffer and explained 
why the amount of three-arm junction increased above 65 
°C. At those temperatures, both the three-arm junction and 
the underlying 40-mer duplex underwent denaturation. 
When aliquots of these solutions were rapidly cooled for 
storage, the three-arm junctions reformed. 

Strand invasion of preformed hybrids was only observed 
with the SBC ODNs. These complementary ODNs formed 
stable three-arm junctions with both the blunt-ended hybrid 
(Figure 5, lane 10) and the recessed hybrid (Figure 7, left 
lanes 1-5). By providing an annealing site for one of the 
SBC ODNs, the recessed hybrid was more rapidly invaded. 
When incubated with the same hybrids, normal ODNs were 
devoid of strand invasion activity (Figure 5, lane 9; Figure 

Figure 6: Thermostability of the three-arm junction formed
between the blunt-ended duplex 3-4 and paired SBC 20-mers. The
three-arm junction was formed by sequential hybridization of SBC
ODNs 1b and 2b with duplex strands 3 and 4 as described in Figure
5. The temperature of the solution containing the junction was then
raised in 5 °C increments starting from 25 °C. After 10 min of
incubation at each temperature, an aliquot was quickly frozen in
dry ice and stored at -20 °C while awaiting gel analysis. Bands
corresponding to the three-arm junction and 40-mer duplex were
detectable. The percent of three-arm junction was determined using
a phosphorimager. Identical results were obtained when reaction
aliquots were immediately run on a gel without prior storage.

7, right lanes 1-5). This was not surprising since these
complementary ODNs hybridized to each other. 



DISCUSSION 

The single-stranded character of paired SBC ODNs is key 
to their remarkable strand invasion properties. At the 
concentrations used here, paired SBC ODNs can readily 
anneal to a homologous duplex which contains a single- 
stranded overhang or a transiently frayed blunt end. Invasion 
of the duplex by both ODNs follows. Use of a single self- 
complementary SBC ODN would reduce the molecularity 
of the annealing step and further promote strand invasion.
In either case, strand invasion is energetically favored by
the creation of one additional base pair for each step of
invasion and may be kinetically favored by the unorthodox
structure of the branch migration intermediate. Normally,
strand exchange between two duplexes occurs within the
context of a Holliday junction. In the presence of MgCl2,
the four double-stranded arms of a Holliday junction form
two quasicontinuous helices (Lilley & Clegg, 1993) which
retard branch migration (Panyutin & Hsieh, 1994; Panyutin
et al., 1995). By contrast, the unusual junction formed when
using SBC ODNs contains three double-stranded and two
single-stranded arms (see the intermediate in Figure 1B). We
hypothesize that the absence of a fourth duplex arm will
increase conformational freedom at the crossover point and
promote strand invasion both in the presence and in the
absence of MgCl2. The rate and extent of strand invasion
observed here would have been improved by conducting the
experiments at 37 °C instead of at room temperature. At
the higher temperature, the paired SBC ODNs would have
been totally single-stranded and the ends of the duplex more [...]

Using two different protocols, the paired SBC ODNs
formed stable, mobile three-arm junctions whereas the paired
unmodified ODNs did not. While the SBC-containing
junctions were slightly more stable than the unmodified
junctions due to the presence of A'-T base pairs, [...]
that the susceptibility of three-arm junctions to resolution is

Figure 7: Strand invasion of the recessed duplex 3-5 by paired
SBC or normal 20-mers. Reaction conditions and electrophoretic
analysis were as described in Figure 5. The 3-5 duplex (4 × 10[...]
M) was incubated with a 5-fold molar excess of normal ODNs 1a
and 2a or SBC ODNs 1b and 2b. Aliquots were removed at the
indicated times to determine the extent of strand invasion. The
percent yield of three-arm junction was determined by phosphorimage
analysis of the dried gel: left lanes 1-5, SBC ODNs;
right lanes 1-5, normal ODNs. TAJ = three-arm junction.

primarily determined by the relative base pairing count of
branched versus linear associations of the strands. Extrusion
of two SBC ODNs from a three-arm junction would be
accompanied by a reduction in the total number of base pairs,
making the branched molecule the more stable species
despite the presence of a junction. On the other hand,
extrusion of two normal ODNs from a three-arm junction
should occur with no loss of base pairing. In this more
typical situation, the three-arm junction would be less stable
than the two component linear duplexes (Lu et al., 1991;
Leontis et al., 1994) and susceptible to resolution (Panyutin
& Hsieh, 1993).

For optimal strand invasion of a short duplex, each of the
two SBC ODNs should form a hybrid with the respective
complementary strand which is fully base-paired and has
equal or greater stability than the starting duplex. Clearly,
SBC ODNs with A' and T' substitutions meet these criteria.
By contrast, two complementary ODNs with a limited
number of mismatched bases at mutually exclusive positions
might not interact with each other and yet form weak hybrids
with complements possessing the wild-type sequences. Such
ODNs would not likely strand invade and do not meet the
definition of SBC ODNs. The robustness of a pair of SBC
ODNs can be quantified by comparing the difference in
stability between the normal hybrid and the SBC-SBC
hybrid to the sum of the stability changes observed when
only the Watson or the Crick strand of the hybrid is
substituted with an SBC ODN. The greater this difference
the more powerful the SBC properties.
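
One way to read this comparison numerically is sketched below in Python. The four -ΔG° values are the first four entries of the first free energy column of Table 1; assigning them to hybrids 1a-2a, 1a-2b, 1b-2a, and 1b-2b is an inference from the differences quoted in the Results (1 and 11 kcal/mol), and the "robustness" formula itself is only one plausible interpretation of the sentence above, not a definition given by the authors.

```python
# -dG values in kcal/mol at 37 C; the hybrid assignment is an assumption
# inferred from the 1 and 11 kcal/mol differences quoted in the text.
NEG_DG = {"1a-2a": 16.2, "1a-2b": 17.3, "1b-2a": 17.2, "1b-2b": 6.2}


def sbc_robustness(neg_dg):
    """Compare the SBC-SBC stability change with the single-strand changes."""
    normal = neg_dg["1a-2a"]
    # Stability change when both strands are SBC ODNs (negative = weaker).
    pair_change = neg_dg["1b-2b"] - normal
    # Sum of the changes when only one strand is an SBC ODN.
    single_sum = (neg_dg["1a-2b"] - normal) + (neg_dg["1b-2a"] - normal)
    return {
        "pair_change": pair_change,
        "single_sum": single_sum,
        "robustness": single_sum - pair_change,
    }


result = sbc_robustness(NEG_DG)

# Average destabilization per A/T position of the 20-mer (sequence 1a).
AT_POSITIONS = sum(1 for base in "GTAAGAGAATTATGCAGTGC" if base in "AT")
per_doublet = -result["pair_change"] / AT_POSITIONS
```

With these inputs the doubly substituted hybrid is about 10 kcal/mol weaker than the normal hybrid even though each single substitution is slightly stabilizing, and spreading that loss over the 12 A/T positions gives a little over 0.8 kcal/mol per position.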

Rules for the design of SBC ODNs using A' and T' [...]
have yet to be defined. On the basis of the free energy values
in Table 1, one can assume that each A'-T' doublet in an
SBC-SBC duplex contributes 0.8 kcal/mol of destabilization.
This assumption provides a framework for predicting
whether a given pair of ODNs can exhibit SBC properties.
Both length and sequence will be important factors in
determining the Tm of paired oligomeric complements. It is
uncertain whether 2-aminoadenine will compromise the
specificity of SBC ODNs. Although capable of [...]
base pairing (Cheong et al., 1988), this analog is used in
place of adenine by the S-2L cyanophage (Kirnos et al., [...];
Khudyakov et al., 1978), and the triphosphate of 2-am[...]

The design of the A'-T' base pair has taken advantage of
[...]
introduction of an amino group at this position together with
[...]
thymine has provided two base analogs which permit
substituted ODNs to discriminate between [...]
complements. The design of G/C analogs for use in
[...] of necessity use different strategies. The
[...] properties of ODNs substituted with such
analogs will be described elsewhere (Woo et al., 1996).

[...] strand invasion substrates employed in this
study can be likened to the stem of a hairpin which might
be found in a long single-stranded nucleic acid. While a
[...] usually [...] equipped to deal [...] a
structure, we have shown that paired SBC ODNs can readily
invade a short duplex. In theory, paired SBC ODNs should
[...] for binding to [...] regions in single-
stranded DNA or RNA due to a net increase in [...].
By contrast, the binding of a traditional ODN to a long [...]
is inhibited by the presence of secondary structure. From
[...] SBC ODN [...] follows that [...] advantage of using
paired SBC ODNs over a traditional ODN in hybridizing to
long single-stranded nucleic acid will be determined by
[...] providing the greatest advantage.

[...] of RNA duplexes and the ability of
RNA to form G-U base pairs favor intramolecular base
pairing of these molecules within the cell. As a result,
transcripts exist as highly folded, globular structures in [...]
[...] ODNs can act as antisense agents (Colige et al., 1993;
[...]erry et al., 1993). Generally, RNA folding programs
cannot accurately predict the native structure of such long
[...] each can adopt many alternative [...]
[...] stabilities. This has hindered
the development of antisense technology by making the
selection of single-stranded regions in mRNA a largely
empirical process.

[...]mentary SBC ODNs could be used as effective antisense
[...] paired SBC ODNs should [...]
[...] secondary structure features than a regular
[...]. Furthermore, SBC-RNA hybrids should
[...] ODNs to form A-type duplexes (Connolly & Newman, 1989;
Newman et al., 1990; Garriea et al., 1991). We have initiated
studies using SBC ODNs to strand invade naturally occurring
[...] RNA. Synthesis of SBC ODNs with
modified RNA backbones should be chemically straightforward
and may potentiate their use as antisense agents.
Besides hybridizing to single-stranded nucleic acids, [...]
[...] the branch capture reaction (Quartin et al., 1989;
Weinstock & Wetmur, 1990). The more general case of targeting
[...] from [...] end would be feasible
if paired SBC ODNs can be used by the recA protein. Others
[...] ends of a denatured restriction fragment to
the complementary sequences in a duplex substrate (Jayasena
[...]; Sena & Zarling, 1993). The resultant
double D-loop is a stable structure which survives the
removal of recA. While this technique does not work well with
short oligomeric complements, it might utilize paired SBC
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ABSTRACT 

Modified oligodeoxyribonucleotides (ODNs) that have 
unique hybridization properties were designed and 
synthesized for the first time. These ODNs, called 
selective binding complementary ODNs (SBC ODNs) f 
are unable to form stable hybrids with each other, yet 
are able to form stable, sequence specific hybrids with 
complementary unmodified strands of nucleic acid. To 
make SBC ODNs, deoxyguanoslne (dG) and deoxy- 
cytidlne (dC) were substituted with deoxylnosine (dl) 
and ^2'-<Jeoxy-(^DHlbofura 

midine-2-<3ty-one (dP), respectively. The hybridization 
properties of several otherwise identical comple- 
mentary ODNs containing one or both of these nucleo- 
side analogs were studied by both UV monitored 
thermal denaturation and non-denaturing PAGE. The 
data showed that while dI and dP did form base pairs 
with dC and dG, respectively, dI did not form a stable 
base pair with dP. A self-complementary ODN uniformly 
substituted with dI and dP acquired single-stranded 
character and was able to strand invade the end of a 
duplex DNA better than an unsubstituted ODN. This 
observation implies that SBC ODNs should effectively 
hybridize to hairpins present in single-stranded DNA 
or RNA. 

INTRODUCTION 

Oligodeoxyribonucleotides (ODNs) do not effectively hybridize 
to complementary sequences which are already base paired. 
Without the assistance of recombinase enzymes such as recA (1), 
accessibility of ODNs to double-stranded DNA (dsDNA) is 
usually restricted to homopurine runs (2) or to extruded single- 
stranded sequences in supercoiled DNA (3). Although less of an 
issue with single-stranded DNA (ssDNA) or RNA, hybridization 
of ODNs to many sequences in these molecules can be 
compromised by intramolecular base pairing (4,5). While 
numerous hybridization strategies have been described to over- 
come or exploit secondary structure, none provides a general 
solution to the problem. Examples include modified ODNs which 
form unusually stable hybrids (6-12), ODNs which form 
triple-stranded complexes (13), ODNs which hybridize to 
hairpins or contiguous flanking sequences (14-18), and the use 
of 'effector' ODNs (19) and 'tethered' ODNs (20) to improve 
binding affinity through cooperative interactions. 

A pair of uniquely modified complementary ODNs (or a single 
self-complementary ODN) that do not hybridize to each other, yet 
do hybridize to unmodified complementary sequences might 
offer a general solution to the challenge of targeting any site in 
DNA or RNA. If such a pair of ODNs could be synapsed to a 
homologous region in dsDNA by recombination, a complement- 
stabilized or double D-loop (21-22) would be formed (Fig. 1a). 
Unlike a simple D-loop, the double D-loop is relatively stable and 
might inhibit gene expression. Alternatively, the same type of 
paired ODNs could be hybridized to a unique sequence in long, 
single-stranded nucleic acid. To the extent that sequence is 
involved in secondary structure (such as a localized hairpin; Fig. 1b), 
the paired ODNs should have an advantage over a standard ODN. 
Whether such ODNs are used as probes or antisense agents, their 
hybridization to a target should generate more new base pairs than 
an unmodified ODN. This is depicted in Figure 1. 
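
The base-pair bookkeeping behind this argument can be sketched in a few lines. This is only an illustration under simplifying assumptions that are not in the paper: all base pairs are counted as equal, and loops, overhangs and junction effects are ignored. The function name and the 17 bp stem length are hypothetical choices for the example.

```python
def net_base_pairs_gained(stem_bp: int, probes_pair_with_each_other: bool) -> int:
    """Net change in base-pair count when a complementary probe pair opens a stem.

    Before invasion the target stem holds `stem_bp` pairs, and the two probes
    hold a further `stem_bp` pairs only if they hybridize to each other.
    After invasion each probe is paired to one arm of the stem (2 * stem_bp).
    """
    before = stem_bp + (stem_bp if probes_pair_with_each_other else 0)
    after = 2 * stem_bp
    return after - before


# Unmodified complementary probes start out as a duplex, so nothing is gained.
print(net_base_pairs_gained(17, probes_pair_with_each_other=True))
# SBC probes start out single-stranded, so every stem pair opened is a net gain.
print(net_base_pairs_gained(17, probes_pair_with_each_other=False))
```

In this simplified count an unmodified probe pair merely exchanges partners, whereas a probe pair that cannot hybridize to itself gains one base pair for every stem position it invades.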

We describe the synthesis of a cytosine (dC) analog. When 
incorporated into an ODN it rearranged to a nucleoside (dP) 
which formed 2-3 hydrogen bonds when opposite a guanosine 
(dG) and 1-2 hydrogen bonds when opposite an inosine (dI). 
When every dC and dG in a pair of complementary ODNs was 
substituted with dP and dI, respectively, the ODNs did not 
hybridize to each other yet did hybridize to unmodified, 
complementary ODNs. By this criterion, the ODNs demonstrated 
selective binding complementarity and are designated SBC 
ODNs. Although the SBC-DNA hybrids were less stable than the 
DNA-DNA hybrid, a self-complementary SBC ODN was more 
effective than the corresponding unmodified ODN in strand 
invading a homologous duplex DNA. Further development of the 
SBC concept will depend upon the synthesis of base analogs 
which form stronger pairs with the natural complement. 
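
As a sketch of the substitution rule described above (dG replaced by dI, dC replaced by dP), the following converts a sequence into its SBC form and counts the weak I·P contacts that would arise if two fully substituted complementary strands were aligned. The single-letter codes I and P, the function names and the six-base example sequence are illustrative assumptions, not data from the paper.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
SBC_SUBSTITUTION = {"G": "I", "C": "P"}  # dG -> dI, dC -> dP


def reverse_complement(seq: str) -> str:
    """Return the Watson-Crick complement, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def to_sbc(seq: str) -> str:
    """Replace every G with I and every C with P; A and T are left unchanged."""
    return "".join(SBC_SUBSTITUTION.get(base, base) for base in seq)


def count_ip_contacts(watson_sbc: str, crick_sbc: str) -> int:
    """Count positions where I faces P when the two strands are aligned antiparallel."""
    aligned = zip(watson_sbc, reversed(crick_sbc))
    return sum(1 for a, b in aligned if {a, b} == {"I", "P"})


watson = "GATCCG"                    # hypothetical example sequence
crick = reverse_complement(watson)   # "CGGATC"
print(to_sbc(watson), to_sbc(crick))
print(count_ip_contacts(to_sbc(watson), to_sbc(crick)))
```

Every G-C position of the unmodified duplex becomes an I·P contact in the SBC-SBC alignment, which is why G/C-rich sequences are the ones most affected by this particular substitution.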

MATERIALS AND METHODS 

Materials and their sources were as follows: DNA synthesis 
reagents, Glen Research; phosphodiesterase I (Crotalus adamanteus 
venom), alkaline phosphatase (calf intestinal) and DNase I, 
Amersham Life Science; T4 polynucleotide kinase (10 U/µl), 
Figure 1. The possible applications of SBC ODNs. (a) The interaction of SBC 
ODNs with dsDNA to form a complement-stabilized D-loop in the presence of 
a recombinase such as recA. (b) The strand invasion of a DNA or RNA hairpin 
by SBC ODNs. In each example hybridization leads to an increase in the total 
number of base pairs, thus providing a thermodynamic drive for the reaction. 



Promega; [γ-32P]ATP, NEN Research. Commercial reagents 
were used as received. 1H-NMR spectra were determined on a 
Varian Gemini-300. Elemental analysis was performed by Quantitative 
Technologies Inc. (Whitehouse, NJ). UV spectra were 
measured on a Beckman DU-40 spectrophotometer or a Perkin 
Elmer Lambda 2S UV/VIS spectrophotometer. 

Preparation of 3-(2'-deoxy-β-D-ribofuranosyl)furano- 
[2,3-d]-pyrimidine-6(5H)-one (dF) 

5-Ethynyl-2'-deoxyuridine (3 g, 11.9 mmol) (23) and copper (I) 
iodide (500 mg, 2.6 mmol) in a 250 ml two-necked 
round-bottomed flask were dried in vacuo for 3 h, placed under 
argon, and suspended in anhydrous DMF (35 ml) and triethylamine 
(15 ml). The solution was vigorously stirred at 120°C under argon 
and every 30 min fresh copper (I) iodide (250 mg, 1.3 mmol) was 
added until most of the starting material had reacted. After 2 h, the 
resulting mixture was filtered and the filtrate was concentrated in 
vacuo to dryness. The residue was suspended in acetone (100 ml) 
and stirred overnight. The desired product was filtered, washed 
with acetone (20 ml), and dried in vacuo to afford 2.2 g of dF as 
a slightly yellowish solid. The remaining product in the mother liquor 
was further purified by silica gel column chromatography 
(elution solvent: 25% MeOH in EtOAc) to afford an additional 
0.3 g of dF (total yield: 2.5 g, 83%): mp 167-168°C; UV (0.05 M 
KHPO4/NaOH, pH 7) 322 nm (ε 12 500). Anal. calcd for 
C11H12N2O5: C, 52.38; H, 4.80; N, 11.11. Found: C, 52.11; H, 
4.81; N, 10.91. 1H NMR (DMSO-d6): the same as reported by 
Kumar et al. (24). 
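
The calculated elemental composition and the percentage yield quoted for dF can be checked with a short calculation. This is a sketch that assumes the molecular formula C11H12N2O5, standard atomic masses, and a combined product mass of 2.5 g (2.2 g plus 0.3 g); the function names are illustrative.

```python
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}
DF_FORMULA = {"C": 11, "H": 12, "N": 2, "O": 5}  # assumed formula of dF


def molar_mass(formula: dict) -> float:
    """Molar mass in g/mol from an element -> count mapping."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def mass_percent(formula: dict, element: str) -> float:
    """Mass percentage contributed by one element."""
    return 100.0 * ATOMIC_MASS[element] * formula[element] / molar_mass(formula)


def percent_yield(product_g: float, product_mw: float, start_mmol: float) -> float:
    """Yield relative to the limiting starting material (1:1 stoichiometry)."""
    product_mmol = 1000.0 * product_g / product_mw
    return 100.0 * product_mmol / start_mmol


for element in ("C", "H", "N"):
    print(element, round(mass_percent(DF_FORMULA, element), 2))
print(round(percent_yield(2.5, molar_mass(DF_FORMULA), 11.9)))
```

Because dF is an isomer of the 5-ethynyl-2'-deoxyuridine starting material, the same molar mass serves on both sides of the yield calculation.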



Preparation of 5'-O-(4,4'-dimethoxytrityl)-dF 

dF (2.17 g, 8.6 mmol) was dried in vacuo at 60°C overnight and 
then added to 4,4'-dimethoxytrityl chloride (3.51 g, 10.4 mmol) 
and anhydrous triethylamine (2.4 ml) in pyridine (30 ml). After 
2 h at room temperature under argon, the resulting mixture was 
diluted with an equal volume of water and extracted with two 
150 ml portions of ether. The ether layer was dried over anhydrous 
sodium sulfate and evaporated to dryness. The residue was 
dissolved in dichloromethane (20 ml) and the desired product 
was precipitated by adding the solution to 400 ml of rapidly 
stirred hexanes. Filtration yielded 4.6 g (96%) of a white solid. 

Preparation of 5'-O-(4,4'-dimethoxytrityl)-dF-3'- 
O-phosphoramidite 

Chloro-[(β-cyanoethoxy)-N,N-diisopropylamino]phosphine (2.9 g, 
12.5 mmol) was added dropwise over 30 s to an anhydrous 
mixture of 5'-O-(4,4'-dimethoxytrityl)-dF (4.6 g, 8.3 mmol), 
diisopropylethylamine (5.8 ml), and dichloromethane (27 ml) 
under argon (25). After 30 min at room temperature the reaction 
was stopped by adding anhydrous methanol (0.3 ml). The reaction 
mixture was extracted with 5% aqueous NaHCO3 (2 x 15 ml) and 
saturated aqueous NaCl (2 x 15 ml). The organic layer was dried 
over anhydrous sodium sulfate, filtered and then evaporated 
under reduced pressure to afford a brown oil. This crude product 
was further purified by silica gel column chromatography using 
hexanes:CH2Cl2:EtOAc:Et3N (4:3:2:1 by vol) as the solvent 
system. Fractions containing the desired product were combined, 
evaporated to dryness, and redissolved in EtOAc (10 ml). 
Precipitation from rapidly stirred hexanes (400 ml) yielded 5.9 g 
(94%) of purified material. 1H NMR (CDCl3) δ 8.88 (d, J = 18.6 
Hz, 1H), 7.5-7.2 (m, 10H), 6.81 (m, 4H), 6.32 (m, 1H), 5.62 (d 
of d, J = 9.5, 3.9 Hz, 1H), 4.70 (m, 1H), 4.19 (m, 1H), 3.8-3.4 (m, 
12H), 2.77 (m, 1H), 2.59 (t, J = 9.4 Hz, 2H), 2.44 (m, 2H), 
1.25-1.0 (m, 12H). 

Conversion of dF to dP 

dF (1 g, 3.96 mmol) was dissolved in 30% aqueous ammonium 
hydroxide (30 ml). After overnight at room temperature, the 
resulting solution was concentrated in vacuo to dryness. The 
residue was suspended in acetone (50 ml), stirred overnight, and 
the undissolved product filtered to afford 850 mg. The mother 
liquor was concentrated to dryness and the residue was suspended 
in acetone (10 ml) overnight with stirring to yield an additional 
100 mg of insoluble product (total 950 mg, 95.4%). This 
compound was analyzed by HPLC, UV and NMR and shown to 
be identical to authentic 3-(2'-deoxy-β-D-ribofuranosyl)pyrrolo- 
[2,3-d]-pyrimidine-2(3H)-one (dP). 

Synthesis and purification of ODNs 

ODNs containing modified bases were synthesized on 1 µmol 
scale using standard procedures for an ABI-394 DNA synthesizer. 
ODNs with the dimethoxytrityl group were purified by HPLC 
using a Hamilton PRP-1 (7.0 x 305 mm) reverse phase column 
employing a gradient of 5 to 45% CH3CN in 0.1 M Et3NH+OAc-, 
pH 7.5, over 20 min with a 2 ml/min flow rate. After detritylation 
with 80% acetic acid, the ODNs were precipitated by addition of 
3 M sodium acetate and 1-butanol. The resulting ODNs were 
dried and further purified by using 20% denaturing PAGE as 
described by Hopkins et al. (26). 
Figure 2. Base pairing schemes for dC and dG analogs. 



Enzymatic digestion of ODNs 

Enzymatic hydrolysis of ODNs was carried out as described by 
Woo et al. (27). The resulting hydrolysate was analyzed by HPLC 
with dual detection at 260 nm and 320 nm (Waters 994 
Programmable Photodiode Array Detector) using a C-18 reverse 
phase column (Rainin, Microsorb™ Short-One®). The solvent 
gradient was run at 1 ml/min as follows: solvent A, 0.1 M 
Et3NH+OAc-, pH 7.5; solvent B, CH3CN; a linear gradient 0 to 
13% B over 10 min, a linear gradient to 100% B over 2 min, then 
isocratic 100% B for 3 min. Peaks were identified by comparison 
of retention times to those of authentic, commercial samples (dA, 
dG, dT and dC) and synthetic samples (dF and dP) prepared by 
known procedures (28). 
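
The three-segment solvent program described above (0 to 13% B over 10 min, then to 100% B over 2 min, then 3 min at 100% B) can be written as a piecewise-linear function of time. This is a minimal sketch; the function name and the choice to reject times outside the 15 min run are assumptions made for the example.

```python
def percent_b(t_min: float) -> float:
    """Percentage of solvent B (CH3CN) at time t_min into the 15 min program."""
    if t_min < 0 or t_min > 15:
        raise ValueError("time lies outside the 15 min gradient program")
    if t_min <= 10:
        # Segment 1: linear 0 -> 13% B over the first 10 min.
        return 13.0 * t_min / 10.0
    if t_min <= 12:
        # Segment 2: linear 13 -> 100% B over the next 2 min.
        return 13.0 + (100.0 - 13.0) * (t_min - 10.0) / 2.0
    # Segment 3: isocratic hold at 100% B for the final 3 min.
    return 100.0


for t in (0, 5, 10, 11, 12, 15):
    print(t, percent_b(t))
```

The shallow first segment is where the nucleosides are resolved; the steep second segment and the hold serve to wash the column.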

Thermal denaturation data (Tm) 

Tm values were recorded on a Perkin Elmer Lambda 2S UV/VIS 
spectrophotometer equipped with a temperature programmer 
(PTP-6) and interfaced to an IBM personal computer (PECSS 
software, Perkin Elmer). Scan rates were 0.5°C/min. Data were 
collected at 260 nm in the temperature range from 5 to 90°C. The 
Tm is defined as the temperature at half the maximal 
hyperchromicity using baseline correction at high and low temperature 
extremes (29). Samples were prepared by dissolving ODNs in 
TNM buffer [10 mM Tris-HCl (pH 8.0), 0.1 mM EDTA, 50 mM 
NaCl, 10 mM MgCl2]. To ensure complete hybridization of 
complementary strands (1:1 molar ratio) before collecting data, 
the samples were incubated at 90°C for 2 min and cooled to 3°C 
over 1 h. The concentration of hybridized ODNs was approximately 
2 µM. 
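
A minimal sketch of how a Tm can be read from an absorbance-versus-temperature curve as the temperature at half-maximal hyperchromicity. It is not the instrument software's algorithm: the baselines are assumed flat (the mean of the first and last few points) rather than sloping, and the melting curve is synthetic data generated only for this example.

```python
import math


def melting_temperature(temps, absorbances, n_baseline=5):
    """Temperature at which absorbance is halfway between the two baselines.

    The low- and high-temperature baselines are taken as flat averages of the
    first and last `n_baseline` points, which is a simplification.
    """
    low = sum(absorbances[:n_baseline]) / n_baseline
    high = sum(absorbances[-n_baseline:]) / n_baseline
    half = low + 0.5 * (high - low)
    points = list(zip(temps, absorbances))
    for (t0, a0), (t1, a1) in zip(points, points[1:]):
        if a0 <= half <= a1 and a1 != a0:
            # Linear interpolation between the two bracketing readings.
            return t0 + (half - a0) * (t1 - t0) / (a1 - a0)
    raise ValueError("curve never crosses the half-maximal absorbance")


# Synthetic two-state melting curve, 5 to 90 degrees C in 0.5 degree steps,
# constructed with its midpoint at 60 degrees C.
temps = [5 + 0.5 * i for i in range(171)]
absorbances = [0.50 + 0.10 / (1 + math.exp(-(t - 60.0) / 2.0)) for t in temps]
print(round(melting_temperature(temps, absorbances), 1))
```

With real data the pre- and post-transition regions usually slope with temperature, so fitting straight lines to both regions before taking the midpoint would be closer to the baseline correction cited in the text.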






Figure 3. Reverse phase HPLC analysis: (a) enzymatic hydrolysate of Watson 
strand in II (see Table 1); (b and c) enzymatic hydrolysate of Watson strand in 
VIII (see Table 1); (d) authentic dP; (e) authentic dF. Detection was at 260 nm 
(a and b) or 320 nm (c, d and e). Retention times increase to the right. 



Gel migration assay 

ODNs with an asterisk (*) in Figures 5 and 6 were 5' 32P-labeled 
using T4 kinase and [γ-32P]ATP (30) and present at 0.5 µM unless 
otherwise indicated. Hybrids were formed by incubating the 
labeled ODN with a 2-fold molar excess of cold complementary 
ODN for 60 min at room temperature in 20 µl TNM buffer. These 
samples were then mixed with 20 µl loading buffer (0.25% 
bromophenol blue, 0.25% xylene cyanol, 2.5% Ficoll type 400) 
and then kept on ice prior to gel electrophoresis. Aliquots (5 µl) 
were analyzed in a 12% non-denaturing polyacrylamide gel [19:1 
acrylamide:bisacrylamide, 0.35 mm thick, 20 x 16 cm, 
polymerized and run in TBE buffer (89 mM Tris-borate/2 mM 
EDTA) containing 3 mM MgCl2]. Pre-electrophoresis in a 
BioRad Protean® II xi apparatus was performed for 1 h at 200 V 
and 10°C. Samples were loaded, and the gel was run as before 
until the bromophenol blue dye had traveled ~15 cm (~5 h). The 
gel was dried and visualized with a phosphorimager (BioRad 
GS-250 Molecular Imager). 

RESULTS AND DISCUSSION 
Design and synthesis of SBC ODNs 

The design paradigm for SBC ODNs is modification of 
complementary dA-dT or dG-dC bases such that the modified 
bases form only one hydrogen bond when paired to each other, yet 
can form two or even three hydrogen bonds when paired to the 
natural partner. We report the synthesis of a complementary pair 
of G/C-rich SBC 28mers substituted with deoxyinosine (dI) in 
place of dG and 3-(2'-deoxy-β-D-ribofuranosyl)furano-[2,3-d]- 
pyrimidine-6(5H)-one (dF) in place of dC. As shown in Figure 2, 
these modified bases should each form two hydrogen bonds with 
their natural partners, dC (2b) or dG (2c), yet only one hydrogen bond with each 
other (2d). Although the SBC-DNA hybrids 
might not be as stable as DNA-DNA hybrids, the SBC-SBC 
hybrids would be much less stable, thus meeting the design goals. 

The dG analog was simply prepared by removal of the N2 
exocyclic amino group of dG to give deoxyinosine (dl). This 
Table 1. Tm values for native and modified ODNs with dI and dP 

Figure 4. Ultraviolet spectra of dF (upper) and dP (lower). 



nucleoside analog is known to preferentially pair with dC 
(31-32). The modified dC was designed to have no hydrogen 
bonding ability at the position equivalent to the N4 exocyclic 
amino group of dC. We chose the bicyclic nucleoside dF to fulfill 
this role. It was expected to be better than monocyclic dC analogs 
(33) because of its base stacking ability. The nucleoside dF was 
synthesized by copper (I)-catalyzed cyclization of the known 
antiviral nucleoside, 5-ethynyl-2'-deoxyuridine (23). Unlike dF 
analogs with a substituent at C2 (23, 34-35), preparation of the 
desired compound was very sensitive to solvent and reaction 
conditions; for example, if pyridine was used as a solvent the major 
product was a dimer of the starting material. dF was 
dimethoxytritylated and converted to its cyanoethoxy phosphoramidite by 
conventional methods (25). 

SBC and unmodified ODNs were synthesized using standard 
procedures on an ABI-394 DNA synthesizer. Based upon recoveries 
of purified products, the average coupling yield was 91% for SBC 
ODNs and 94% for unmodified ODNs. The nucleoside composition 
of a representative SBC ODN was determined by reverse phase 
HPLC analysis of an enzyme hydrolysate. As shown in Figure 3, 
no peak corresponding to authentic dF was observed in the 
nucleoside hydrolysate from the modified ODN. Instead a peak 



corresponding to 3-(2'-deoxy-β-D-ribofuranosyl)pyrrolo-[2,3-d]- 
pyrimidine-2(3H)-one (dP) was observed, suggesting that the dF 
had been converted to dP during the treatment with 30% aqueous 
ammonium hydroxide used in the final step in DNA synthesis. 
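
The practical weight of an average coupling yield is easier to see when it is compounded over a whole oligomer. The sketch below assumes that a 28mer requires 27 coupling steps and that each step proceeds with the same average yield; because the yields quoted in the text were back-calculated from recoveries of purified product, they also absorb purification losses. Function names are illustrative.

```python
def full_length_fraction(coupling_yield: float, length: int) -> float:
    """Fraction of chains reaching full length after (length - 1) couplings."""
    return coupling_yield ** (length - 1)


def average_coupling_yield(recovered_fraction: float, length: int) -> float:
    """Inverse calculation: per-step yield implied by an overall recovery."""
    return recovered_fraction ** (1.0 / (length - 1))


sbc = full_length_fraction(0.91, 28)
unmodified = full_length_fraction(0.94, 28)
print(f"SBC 28mer: {sbc:.3f}   unmodified 28mer: {unmodified:.3f}")
print(f"round trip: {average_coupling_yield(sbc, 28):.2f}")
```

A difference of three percentage points per step therefore translates into a more than two-fold difference in recovered full-length product for an oligomer of this size.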

The dP isolated from the modified ODN was identical to 
authentic dP by both UV analysis (Fig. 4) and HPLC coinjection 
in two gradient solvent systems. The dF nucleoside, when treated 
overnight with 30% aqueous NH4OH at room temperature, 
rearranged to a compound (95.4% recovered yield) which had a 
1H NMR identical to that previously reported for dP (28). Based 
on these results we conclude that dP was the only dC analog 
detectable in the ODN hydrolysate and that >90% of the dF 
residues had been converted to dP by base treatment. Subsequent 
attempts to incorporate dP directly into ODNs using the 
phosphoramidite method were not successful due to instability 
during the iodine oxidation conditions employed in the standard 
ODN synthesis cycle. 

Although dP still could hydrogen bond at N1 with the carbonyl 
group at C6 of dG or dI, this interaction should be relatively weak 
due to suboptimal orientation of the N1 hydrogen (Fig. 2e and f). 
Previous studies on the base pairing properties of dP have shown 
that it preferentially pairs with dG and that this base pair is slightly 
less stable than dC-dG (28). 

Hybridization properties of SBC ODNs 

Table 1 shows the hybridization properties of 28mer ODNs 
containing dI for dG, dP for dC, or both. The sequence, taken 
from pBR322 plasmid, had a G-C content of 60.7%. Introduction 
of either dI or dP into one or both strands of the duplex decreased 
its Tm by 1.8-2.0 or 0.4-0.7°C, respectively, per modified base 
pair. When only one strand of the hybrid was substituted with both 
dI and dP, the Tm dropped by up to 1.6°C per modified base pair. 
These values reflect a slight destabilization attributable to the 
dG-dP base pair and a larger destabilization due to the dI-dC base 
pair. When both strands of the hybrid were substituted with dI and 
dP, however, the Tm drop per modified base pair increased 
significantly to 3.3°C. 
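
To put the per-base-pair figures in context, the following sketch converts them into a total Tm change for the 28mer. It assumes that a G-C content of 60.7% corresponds to 17 G-C pairs and that all of them are modified in the fully substituted hybrid; the function names are illustrative, and the result is simple arithmetic rather than a value taken from Table 1.

```python
def modified_pair_count(length: int, gc_fraction: float) -> int:
    """Number of G-C positions in a duplex of the given length."""
    return round(length * gc_fraction)


def total_tm_drop(drop_per_pair_c: float, n_modified_pairs: int) -> float:
    """Total Tm decrease implied by an average per-pair destabilization."""
    return drop_per_pair_c * n_modified_pairs


n_pairs = modified_pair_count(28, 0.607)
print(n_pairs)
# Both strands substituted (SBC-SBC hybrid), 3.3 degrees C per modified pair.
print(round(total_tm_drop(3.3, n_pairs), 1))
```

A cumulative drop of this size is what moves the SBC-SBC hybrid from a stable duplex to one that does not persist at room temperature, while hybrids with only one substituted strand lose far less.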

Some of the hybrids were analyzed by non-denaturing PAGE 
(Fig. 5). As shown in Table 1 and Figure 5, the SBC ODNs 
containing both dI for dG and dP for dC (Watson in VIII, Crick 
in IX and both Watson and Crick in X) did not form a stable hybrid 
with each other at room temperature (hybrid X; Fig. 5, lane 8), yet 
did form stable hybrids with their unmodified complementary 



2474 Nucleic Acids Research. 1996. Vol. 24. No. 13 






Figure 5. Gel mobility shift analysis of selected ODNs and hybrids listed in 
Table 1. 32P end-labeled ODNs are denoted by an asterisk (*) and Watson/Crick 
strands by W/C (see experimental section for details). Single-stranded ODN 
was analyzed in lanes 1 (*W in I), 3 (*C in IV), 5 (*C in VII), and 7 (*C in X). 
Complementary ODNs were analyzed in lanes 2 (*W/C in I), 4 (W/*C in IV), 
6 (W/*C in VII), 8 (W/*C in X), 9 (*W/C in III), 10 (*W/C in VI), and 11 
(*W/C in IX). For lane 12 (W/*C in IX + C in I; molar ratio 2:1:3), the 
preformed hybrid IX was treated at room temperature for 60 min with 
unmodified Crick. For lane 13 (W in I + C in I + *C in IX; molar ratio 1:1:1), 
unmodified Watson (1 µM) was mixed simultaneously with unmodified Crick 
and SBC Crick and incubated at room temperature for 60 min. For lane 14 (W/C 
in VIII plus W/*C in IX; molar ratio 2:1:2:1), the hybrid VIII was mixed with 
the hybrid IX and incubated at room temperature overnight. 



ODN strands (hybrid IX; Fig. 5, lane 11). As a result, these ODNs 
exhibited selective complementary binding. Despite the reduced 
stability of hybrids formed between SBC and normal ODNs, the 
normal Watson strand showed no preference for the normal Crick 
over the SBC Crick strand when equimolar amounts of these three strands 
were mixed simultaneously at room temperature; about equal 
amounts of duplexes I and IX were formed (Fig. 5, lane 13). 
Additionally, there was little, if any, strand displacement or strand 
exchange when the pre-formed DNA-SBC duplex IX was 
incubated with the normal homolog of the SBC strand or with 
SBC-DNA duplex VIII; not much single-stranded SBC was 
formed (Fig. 5, lanes 12 and 14). These data clearly demonstrate 
that the SBC ODNs described above behaved like natural ODNs 
when hybridized to unmodified complements, yet did not form 
stable hybrids with themselves. 

Strand invasion of a DNA duplex 

To determine whether an SBC ODN could strand invade dsDNA, 
a 17 base pair segment of hybrid X was synthesized as a single 
self-complementary ODN (XIII; Fig. 6A). Linking the comple- 
mentary domains into one ODN was expected to improve the 
kinetics of strand invasion and the stability of the product. The 
chimeric SBC ODN had a Tm of 31°C and hybridized to a partial 
DNA complement (Watson in XI) at room temperature (Fig. 6B, 
lane 4). The corresponding unmodified ODN (XII), derived from 
hybrid I, had a Tm of 80°C and hybridized poorly to the same 
DNA complement (Fig. 6B, lane 3). A 48 bp duplex (XI) with one 
end homologous to the self-complementary ODNs was used as a 
substrate for strand invasion. Annealing was facilitated by the 
presence of two 5 base long single-stranded overhangs in XI 
which could hybridize to complementary four base long overhangs 
in the invading ODNs. The duplex can be likened to the 
stem of a hairpin that might exist in a long ssDNA. After 4 h at 
room temperature, a 10-fold excess of XIII converted 73% of XI 
to a three-way junction compared with 17% for XII (Fig. 6B, lanes 
Figure 6. Gel mobility shift analysis of the strand invasion properties of normal 
and SBC self-complementary ODNs. (A) Sequences of the dsDNA duplex (XI) 
and the strand invading normal (XII) and SBC (XIII) self-complementary 
ODNs. Employing the convention of Table 1, the upper and lower strands of XI 
are Crick and Watson, respectively. (B) Gel mobility shift analysis of strand 
invasion. Reactions were incubated 4 h at room temperature in TNM buffer 
prior to electrophoresis. Unless otherwise indicated, hybrids were formed by 
mixing the labeled ODN (0.1 µM) with a 2-fold molar excess of cold 
complementary ODN. Single-strand (*W in XI) and double-strand (*W/C in 
XI) standards were run in lanes 1 and 2. Hybridization reactions between free 
*W strand of duplex XI and self-complementary ODNs XII or XIII (molar ratio 
1:10) were analyzed in lanes 3 and 4. Strand invasion reactions between duplex 
XI (*W/C) and self-complementary ODNs XII or XIII (molar ratio 1:10) were 
analyzed in lanes 5 and 6. 



6 and 5). Ongoing studies indicate that the self-complementary SBC 
ODN has a significant kinetic advantage over the unmodified 
ODN. 

Conclusions 

Based on thermal denaturation and non-denaturing gel mobility 
shift assays, we have designed and synthesized for the first time 
modified ODNs which exhibit selective complementary hybridiza- 
tion. A self-complementary version of one of these paired ODNs 
strand invaded a homologous double-stranded DNA better than 
the corresponding unmodified ODN. The possible diagnostic and 
therapeutic uses of these ODNs are being explored. Efforts to 
improve the hybridization properties of SBC ODNs including the 
modification of dA and dT are also underway. 

ACKNOWLEDGEMENTS 

We thank Drs I. V. Kutyavin, A. Gall, V. Gorn and E. Lukhtanov 
for helpful discussions. We thank Mr D. Adams and Ms A. Yang 
for technical contributions. 






REFERENCES 

1 Cheng, S., Van Houten, B., Gamper, H.B., Sancar, A. and Hearst, J.E. (1988) 
J. Biol. Chem., 263, 15110-15117. 

2 Hélène, C. (1993) In Crooke, S.T. and Lebleu, B. (eds) Antisense Research 
and Applications. CRC Press, Boca Raton, pp. 375-385. 

3 Iyer, M., Norton, J.C. and Corey, D.R. (1995) J. Biol. Chem., 270, 
14712-14717. 

4 Gamper, H.B., Cimino, G.D. and Hearst, J.E. (1987) J. Mol. Biol., 197, 
349-362. 

5 Chastain, M. and Tinoco, I., Jr. (1993) In Crooke, S.T. and Lebleu, B. (eds) 
Antisense Research and Applications. CRC Press, Boca Raton, pp. 55-66. 

6 Sproat, B.S. and Lamond, A.I. (1993) In Crooke, S.T. and Lebleu, B. (eds) 
Antisense Research and Applications. CRC Press, Boca Raton, pp. 
352-262. 

7 Monia, B.P., Lesnik, E.A., Gonzalez, C., Lima, W.F., McGee, D., 
Guinosso, C.J., Kawasaki, A.M., Cook, P.D. and Freier, S.M. (1993) J. Biol. 
Chem., 268, 14514-14522. 

8 Gryaznov, S. and Chen, J.-K. (1994) J. Am. Chem. Soc., 116, 3143-3144. 

9 Nielsen, P.E., Egholm, M. and Buchardt, O. (1994) Bioconjugate Chem., 5. 

10 Lamm, G.M., Blencowe, B.J., Sproat, B.S., Iribarren, A.M., Ryder, U. and 
Lamond, A.I. (1991) Nucleic Acids Res., 19, 3193-3198. 

11 Wagner, R.W., Matteucci, M.D., Lewis, J.G., Gutierrez, A.J., Moulds, C. and 
Froehler, B.C. (1993) Science, 260, 1510-1513. 

12 Asseline, U., Delarue, M., Lancelot, G., Toulmé, F., Thuong, N.T., 
Montenay-Garestier, T. and Hélène, C. (1984) Proc. Natl. Acad. Sci. USA, 
81, 3297-3301. 

13 Wang, S. and Kool, E.T. (1994) J. Am. Chem. Soc., 116, 8857-8858. 

14 Lima, W.F., Monia, B.P., Ecker, D.J. and Freier, S.M. (1992) Biochemistry, 
31, 12055-12061. 

15 Ecker, D.J., Vickers, T.A., Bruice, T.W., Freier, S.M., Jenison, R.D., 
Manoharan, M. and Zounes, M. (1992) Science, 257, 958-961. 

16 François, J.-C., Thuong, N.T. and Hélène, C. (1994) Nucleic Acids Res., 22, 
3943-3950. 

17 Brossalina, E. and Toulmé, J.-J. (1993) J. Am. Chem. Soc., 115, 796-797. 

18 François, J.-C. and Hélène, C. (1995) Biochemistry, 34, 65-72. 

19 Kutyavin, I.V., Podyminogin, M.A., Bazhina, Y.N., Fedorova, O.S., 
Knorre, D.G., Levina, A.S., Mamayev, S.V. and Zarytova, V.F. (1988) FEBS 
Lett., 238, 35-38. 

20 Richardson, P.L. and Schepartz, A. (1991) J. Am. Chem. Soc., 113, 
5109-5111. 

21 Jayasena, V.K. and Johnston, B.H. (1993) J. Mol. Biol., 230, 1015-1024. 

22 Sena, E.P. and Zarling, D.A. (1993) Nature Genetics, 3, 365-372. 

23 Robins, M.J. and Barr, P.J. (1983) J. Org. Chem., 48, 1854-1862. 

24 Kumar, R., Knaus, E.E. and Wiebe, L.I. (1991) J. Heterocyclic Chem., 28, 
1917-1925. 

25 Sinha, N.D., Biernat, J., McManus, J. and Köster, H. (1984) Nucleic Acids 
Res., 12, 4539-4557. 

26 Hopkins, P.B., Millard, J.T., Woo, J., Weidner, M.F., Kirchner, J.J., 
Sigurdsson, S.T. and Raucher, S. (1991) Tetrahedron, 47, 2475-2489. 

27 Woo, J., Sigurdsson, S.T. and Hopkins, P.B. (1993) J. Am. Chem. Soc., 115, 
3407-3415. 

28 Inoue, H., Imura, A. and Ohtsuka, E. (1987) Nippon Kagaku Kaishi, 7, 
1214-1220. 

29 Albergo, D.D., Marky, L.A., Breslauer, K.J. and Turner, D.H. (1981) 
Biochemistry, 20, 1409-1416. 

30 Maniatis, T., Fritsch, E.F. and Sambrook, J. (1982) Molecular Cloning: A 
Laboratory Manual. Cold Spring Harbor Laboratory, Cold Spring Harbor, 
NY, pp. 125-126. 

31 Case-Green, S.C. and Southern, E.M. (1994) Nucleic Acids Res., 22, 
131-136. 

32 Martin, F.H., Castro, M.M., Aboul-ela, F. and Tinoco, I. (1985) Nucleic Acids 
Res., 13, 8927-8938. 

33 Gildea, B. and McLaughlin, L.W. (1989) Nucleic Acids Res., 17, 
2261-2281. 

34 Robins, M.J., Vinayak, R.S. and Wood, S.G. (1990) Tetrahedron Lett., 31, 
3731-3734. 

35 Crisp, G.T. and Flynn, B.L. (1993) J. Org. Chem., 58, 6614-6619. 







United States Patent [19] 

Kutyavin et al. 

US005912340A 

[11] Patent Number: 5,912,340 
[45] Date of Patent: Jun. 15, 1999 



[54] SELECTIVE BINDING COMPLEMENTARY 
OLIGONUCLEOTIDES 

[75] Inventors: Igor V. Kutyavin, Bothell; Jinsuk 
Woo, Lynnwood; Eugeny A. 
Lukhtanov; Rich B. Meyer, Jr., both 
of Bothell; Howard B. Gamper, 
Woodinville, all of Wash. 

[73] Assignee: Epoch Pharmaceuticals, Inc., Bothell, 
Wash. 

[21] Appl. No.: 08/539,097 
[22] Filed: Oct. 4, 1995 

[51] Int. Cl.6 C07H 21/04 

[52] U.S. Cl. 536/24.5; 536/26.6; 536/24.3 

[58] Field of Search 536/24.5, 26.6, 24.3 

[56] References Cited 

FOREIGN PATENT DOCUMENTS 

WO93/03736 3/1993 WIPO - A61K 31/70 

95/05391 2/1995 WIPO . 

95/14707 6/1995 WIPO C07H 19/16 

OTHER PUBLICATIONS 

Scheit et al. Studia Biophysica vol. 55, No. 1, 1976. pp. 
21-27. 

Inoue et al. Chemical Abstracts vol. 108, No. 21, 1988. p. 
752, col. 1. 

Database WPI, AN 87-352165[50] (see abstract) & Patent 
Abstracts of Japan, vol. 012, No. 139 (see abstract). 
Case-Green et al. Nucleic Acids Research, vol. 22, No. 2, 
1994. pp. 131-136. 

Martin et al. Nucleic Acids Research, vol. 13, No. 24, 1985. 
pp. 8927-8938. 

Chollet et al. Nucleic Acids Research, vol. 16, No. 1, 1988. 
pp. 305-317. 

Kuimelis et al. Nucleic Acids Research, vol. 22, No. 8, 1994. 
pp. 1429-1436. 

Ishikawa et al. Chemical Abstracts, vol. 116, No. 13, 1992. 
pp. 949. col. 2. 

Newman et al. Biochemistry, vol. 29, No. 42, 1990. pp. 
9891-9901. 

Richardson et al. Journal of the American Chemical Society, 
vol. 113, No. 13, 1991. pp. 5109-5111. 
Woo J et al. Nucleic Acids Research, vol. 24, No. 13, 1996. 
pp. 2470-2475. 

Strobel, S.A., et al. (1991) Science, 254:1639. 
Weinstock, P., et al. (1990) Nucl. Acids Res., 18:4207. 
Roca, A.I., et al. (1990) Rev. Biochem. Mol. Biol., 25:415. 
Robins et al. (1982) Can. J. Chem., 60:554. 



Robins et al. (1983) J. Org. Chem., 48:1854. 
Meyer, et al. (1989) J. Am. Chem. Soc., 111:8517. 
Kobayashi (1973) Chem. Pharm. Bull., 21:941-951. 
Sonveaux (1986) Bioorganic Chemistry, 14:274-325. 
Jones (1984) "Oligonucleotide Synthesis, a Practical 
Approach", M.J. Gait, Ed., IRL Press, pp. 23-34. 
Langer et al. (1981) Proc. Natl. Acad. Sci. USA, 
78:6633-6637. 
"Nucleic Acid Hybridisation, a Practical 
Approach", Hames and Higgins, Eds., IRL Press, 1985. 
Gall and Pardue (1969) Proc. Natl. Acad. Sci. USA, 
63:378-383. 

John et al., (1969) Nature, 223:582-587. 

"Physical Biochemistry", Freifelder, D., W.H. Freeman & 

Co., 1982, pp. 537-542. 

Tijssen, P. (1985) "Practice and theory of Enzyme Immu- 
noassays, Laboratory Techniques in Biochemistry and 
Molecular Biology", Burdon, R.H., van Knippenberg, PH., 
Eds, Elsevier, pp. 9-20. 

Connolly, et al., (1989) Nucleic Acids Res. ,17:4957-4974. 
Fathi, et al. (1990) Tetrahedron Lett, 31:319-322. 
Sinha, et al. (1984) Nucleic Acids Research, 12:4539. 
Alul, et al. (1991) Nucleic Acids Res., 19:1527-1532. 
Atkinson, T. and Smith, M. (1984) "Oligonucleotide Syn- 
thesis, a Practical Approach", M. Gait, Ed., IRL Press, 
Washington, D.C., pp. 35-81. 

Primary Examiner: Scott W. Houtteman 
Attorney, Agent, or Firm: Klein & Szekeres, LLP 



[57] 



ABSTRACT 



In a matched pair of oligonucleotides (ODNs) each member 
of the pair is complementary or substantially complementary 
in the Watson Crick sense to a target sequence of duplex 
nucleic acid where the two strands of the target sequence are 
themselves complementary to one another. The ODNs 
include modified bases of such nature that the modified base 
forms a stable hydrogen bonded base pair with the natural 
partner base, but does not form a stable hydrogen bonded 
base pair with its modified partner. This is accomplished 
when in a hybridized structure the modified base is capable 
of forming two or more hydrogen bonds with its natural 
complementary base, but only one hydrogen bond with its 
modified partner. Due to the lack of stable hydrogen bonding 
with each other, the matched pair of oligonucleotides have 
a melting temperature under physiological or substantially 
physiological conditions of approximately 40° C. or less. 
However each of the matched ODN pair of the invention 
forms a substantially stable hybrid with the target sequence 
in each strand of the duplex nucleic acid. The hybrids of 
target duplex nucleic acids formed with the ODN pairs of the 
invention are useful for gene mapping and in diagnostic and 
therapeutic applications. 

26 Claims No Drawings 
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A sufficient number of the modified SBC nucleotides are 
incorporated into a matched pair of SBC ODNs of the present
invention, such that complementary positions in both SBC
ODNs are modified, so that the matched pair does not form a
stable hybrid; in other words, under physiological conditions
the pair has a melting temperature of approximately 40° C. or
less. It is not necessary to replace each natural nucleotide of
the ODN with a modified SBC nucleotide in order to
accomplish this. Both members of the matched pair are,
however, complementary to a target sequence in double
stranded or duplex nucleic acid, where the two strands or
parts of the target duplex are themselves complementary or
substantially complementary to one another. As is described
in more detail below, an important use of the SBC ODNs of
the present invention is hybridization with secondary
structure of mRNA wherein the mRNA itself forms a duplex,
such as in hairpin loops. It is known that secondary
structures of mRNA and ribosomal RNA do not have two
strands in the strict sense of that term. Nevertheless, unless
the context otherwise indicates, in the present description the
terminology "two strands" of double stranded nucleic acids
also refers to the two complementary portions of duplex
mRNA or of duplex ribosomal RNA as well. The general
concept of double stranded DNA and of secondary structure
in mRNA and ribosomal RNA is covered in this description
by the term "duplex nucleic acid". The term "RNA" can
apply to any functional RNA in living organisms, such as
messenger, transfer, ribosomal, small nuclear, guide,
genomic, etc. RNA.
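
The matched-pair rule described above can be sketched in a few lines of Python. This is an illustrative sketch only: the 10-nucleotide sequence, the function names and the decision to modify every A and T are assumptions made for the example, while the abbreviations 2amA, 2sT, I and P are the ones this description uses for the preferred analogs.

```python
# Illustrative sketch: build a matched pair of SBC ODNs in the notation of
# this description. Sequence and helper names are hypothetical.

# Natural base -> abbreviation of the preferred SBC analog.
SBC_ANALOG = {"A": "2amA", "T": "2sT", "G": "I", "C": "P"}
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement_strand(seq):
    """Base-by-base complement, written in the same left-to-right order
    (so a 5'->3' input yields the partner strand written 3'->5')."""
    return "".join(COMPLEMENT[base] for base in seq)


def to_sbc(seq, modify=("A", "T")):
    """Return the residues of seq with the chosen natural bases replaced
    by their SBC analogs; other bases are left unmodified."""
    return [SBC_ANALOG[base] if base in modify else base for base in seq]


def matched_pair(target_5to3, modify=("A", "T")):
    """Both members of the pair carry the modification at complementary
    positions: one is the modified target strand, the other the modified
    complement."""
    first = to_sbc(target_5to3, modify)
    second = to_sbc(complement_strand(target_5to3), modify)
    return first, second


def modified_doublets(first, second):
    """Count positions where both paired residues are SBC analogs."""
    analogs = set(SBC_ANALOG.values())
    return sum(1 for a, b in zip(first, second)
               if a in analogs and b in analogs)


first, second = matched_pair("GTAAGAGAAT")
print("5'-" + "".join(first) + "-3'")
print("3'-" + "".join(second) + "-5'")
print(modified_doublets(first, second), "modified doublets")
```

Passing `modify=("G", "C")` instead would give a pair substituted with I and P only, which corresponds to the other preferred substitution described here.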
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R4 is H. alkyl, C a _ 6 alkenyl, C^ alkynyl, or option- 
ally the 5-position of the pyrimidine serves as point of 
attachment for a cross-linking function, or a reporter 
group as described below. A preferred embodiment of 
the SBC nucleotide T has 2-thio-4-oxo-5-methylpyrimidine
(2-thiothymine) as the base, as shown in Formula 2b. The
latter nucleotide is abbreviated as 2-sT or d2-sT as
applicable.

Formula 8 [structural drawing not reproduced]

A general structure for a preferred class of the modified G
analog, G*, within the scope of the invention and shown as
a 3'-phosphate (or phosphorothioate) incorporated into the
SBC ODN, is provided by Formulas 9, 10 and 11, wherein

Ri is H, alkyl, C^ alkoxy, C 2 . 4 alkylthio, F or NHR 3 
where R 3 is defined as above, 

X, Y, Z and R are defined as above, and the 8 position of
the purine, the 3 position of the pyrazolopyrimidine or
the 5 position of the pyrrolopyrimidine optionally serve
as point of attachment for a cross-linking agent, or
reporter group as described below. A preferred
embodiment of the SBC nucleotide G* has 6-oxo-purine
(hypoxanthine) as the base, as shown in Formula 3b.
The latter nucleotide is abbreviated as I or dI as
applicable.

Formula 9 [structural drawing not reproduced]
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-continued 



Formula 10 [structural drawing not reproduced]

Formula 11 [structural drawing not reproduced]



A general structure for a preferred class of the modified C 
analog, C, within the scope of the invention and shown as 
a 3'-phosphate (or phosphorothioate) incorporated into the 
SBC ODN, is provided by Formulas 12 and 13, wherein 

Y, Z, R and R 4 are defined as above, or optionally the 
5-position of the pyrimidine serves as point of attach- 
ment for a cross-linking function, or a reporter group as 
described below; 

Z1 is O or NH, and

R5 is H or C1-4 alkyl.

A preferred embodiment of the SBC nucleotide C has
pyrrolo-[2,3-d]pyrimidine-2(3H)-one as the base, as shown
in Formula 4b. The latter nucleotide is abbreviated as P or
dP as applicable.
-(CH2)q-NH-CO-(CH2)m-(X*)n-N(R1)-(CH2)p-L and

-(CH2)q'-O-(CH2)q"-NH-CO-(CH2)m-(X*)n[...]

where q, m and L are defined as above, q' is 3 to 7 inclusive,
q" is 1 to 7 inclusive, X* is phenyl or simple substituted
phenyl (such as chloro, bromo, lower alkyl or lower alkoxy
substituted phenyl), n is 0 or 1, p is an integer from 1 to 6,
and R1 is H, lower alkyl or (CH2)p-L. Preferably p is 2.



Those skilled in the art will recognize that the structure
-N(R1)-(CH2)2-L describes a "nitrogen mustard", which is a
class of potent alkylating agents. Particularly preferred
within this class of SBC ODNs of the invention are those
where the cross-linking agent includes the functionality
-N(R1)-(CH2)2-L where L is halogen, preferably chlorine; and
even more preferred within this class are those modified SBC
ODNs where the cross linking agent includes the grouping
-N-[(CH2)2-L]2 (a "bifunctional" N-mustard).

A particularly preferred partial structure of the cross
linking agent includes the grouping

-CO-(CH2)3-C6H4-N-[(CH2)2Cl]2.

In a particularly preferred embodiment the just-noted
cross-linking group is attached to an n-hexylamine bearing
tail at the 5' and 3' ends of the SBC ODN in accordance with
the following structure:

R'-O-(CH2)6-NH-CO-(CH2)3-C6H4-N-[(CH2)2Cl]2

where R' signifies the terminal 5' or 3'-phosphate group of
the SBC ODN.

Other examples for the A*-L group, particularly when
attached to a heterocyclic base in the oligonucleotide (such
as to the 5-position of 2'-deoxyuridine) are
3-iodoacetamidopropyl, 3-(4-bromobutyramido)propyl,
4-iodoacetamidobutyl and 4-(4-bromobutyramido)butyl
groups.

In accordance with other preferred embodiments, the
cross-linking functionality is covalently linked to the
heterocyclic base, for example to the uracil moiety of a
2'-deoxyuridylic acid building block of the SBC ODN. The
linkage can occur through the intermediacy of an amino
group, that is, the "arm-leaving group combination" (A*-L)
may be attached to a 5-amino-2'-deoxyuridylic acid building
unit of the SBC ODN. In still other preferred embodiments
the "arm-leaving group combination" (A*-L) is attached to
the 5-position of the 2'-deoxyuridylic acid building unit of
the SBC ODN by a carbon-to-carbon bond. Generally
speaking, 5-substituted-2'-deoxyuridines can be obtained by
an adaptation of the general procedure of Robins et al. (Can.
J. Chem., 60:554 (1982); J. Org. Chem., 48:1854 (1983)), as
shown in Reaction Scheme 1. In accordance with this
adaptation, the palladium-mediated coupling of a substituted
1-alkyne to 5-iodo-2'-deoxyuridine gives an acetylene-coupled
product. The acetylenic dUrd analog is reduced, with Raney
nickel for example, to give the saturated compound, which is
then used for direct conversion to a reagent for use on an
automated DNA synthesizer, as described below. In Reaction
Scheme 1, q is defined as above, and Y is either Y* (as
defined above) or is a suitable protected derivative of Y*.
Y can also be defined as a group which terminates in a
suitably protected nucleophilic function, such as a protected
amine. Examples of reagents which can be coupled to
5-iodo-2'-deoxyuridine in accordance with this scheme are
HC≡CCH2OCH2CH2N(CO)2C6H4 (phthalimidoethoxypropyne),
HC≡CCH2OCH2CH2NHCOCF3 (trifluoroacetamidoethoxypropyne),
HC≡CCH2N(CO)2C6H4 (phthalimidopropyne) and
HC≡CCH2NHCOCF3 (trifluoroacetamidopropyne).

In these examples the nucleosides which are obtained in
this scheme are incorporated into the desired SBC ODN, and
the alkylating portion of the cross-linking agent is attached
to the terminal amino group of "Y" only after removal of the
respective phthalic or trifluoroacetyl blocking groups.

Another particularly preferred example of an "arm-leaving
group combination" (A*-L) is attachment of a
nitrogen-mustard type alkylating agent (or other alkylating
agent) to the amino function of a 5-(3-aminopropyl)-2'-
deoxyuridine building unit of the SBC ODN. The appropriate
nucleotide building unit for ODN synthesis which includes
the 5-(3-aminopropyl)-2'-deoxyuridine nucleoside moiety can
be obtained in analogy to Reaction Scheme 1, and in
accordance with the teaching of Meyer et al., J. Am. Chem.
Soc. 1989, 111, 8517. In this particularly preferred
embodiment the nucleotide having the 5-(3-aminopropyl)-2'-
deoxyuridine moiety is incorporated into the SBC ODN by
routine synthesis, and the cross-linking function is
introduced by reacting the SBC ODN with an activated form of
a "nitrogen mustard", such as 2,3,5,6-tetrafluorophenyl-4'-
[bis(2-chloroethyl)amino]phenylbutyrate (Chlorambucil
2,3,5,6-tetrafluorophenyl ester; chlorambucil itself is
commercially available).



Reaction Scheme 1 [structural drawings not reproduced;
legible labels: a 1-alkyne terminating in the group Y, Pd(0),
dR]

A moiety containing the leaving group, such as a haloacyl
group (CO-CH2-L where L is halogen, for example I) or
-CO-(CH2)m-(X*)n-N(R1)-(CH2)p-L group (even more preferably
a CO-(CH2)3-C6H4-N-[CH2CH2Cl]2) may be added to the
aminoalkyl or like groups (-(CH2)q-Y*) following
incorporation into oligonucleotides and removal of any
blocking groups. For example, addition of an α-haloacetamide
may be verified by a changed mobility of the modified
compound on HPLC, corresponding to the removal of the
positive charge of the amino group, and by subsequent
readdition of a positive charge by reaction with
2-aminoethanethiol to give a derivative with reverse phase
HPLC mobility similar to the original
aminoalkyl-oligonucleotide.

In the situations where the cross linking agent (A*-L
moiety) is attached to the 3' or 5' terminus of the
oligonucleotide, for example by an alkylamine linkage of the
formula -(CH2)q-Y* (Y* terminating in an amine), the
oligonucleotide synthesis may be performed to first yield the
oligonucleotide with said aminoalkyl tail, to which then an
alkylating moiety, such as the above-noted haloacyl group
(CO-CH2-L) or -CO-(CH2)m-(X*)n-N(R1)-(CH2)p-L is
introduced.

SBC ODNs bearing a reporter group, lipophilic group or tail

As is known in the art a "reporter group" can be broadly
defined as a group that is incorporated in, or is attached to
an ODN and which renders detection or isolation of the ODN
possible by application of some analytical, physical,
chemical or biochemical method. Generally speaking,
reporter groups are attached to ODNs when the ODNs are used
as probes. In terms of attaching reporter groups to ODNs in
the general sense, the art is well developed and is recited
here only in a summary fashion. The SBC ODNs of the present
invention having a reporter group (such as a radioactive
label) attached can be utilized substantially in accordance
with state-of-the-art hybridization technology, to detect
specific target sequences in duplex regions of nucleic acids.
The advantage of the SBC ODNs of the present invention, as
compared to the prior art, is that the SBC ODN of the present
invention can effectively invade and bind to the duplex
nucleic acid sequence.

Thus, probes may be labeled by any one of several methods
typically used in the art. A common method of detection is
the use of autoradiography with 3H, 125I, 35S, 14C, or 32P
labeled probes or the like. Other reporter groups include
ligands which bind to antibodies labeled with fluorophores,
chemiluminescent agents, and enzymes. Alternatively, probes
can be conjugated directly with labels such as fluorophores,
chemiluminescent agents, enzymes and enzyme substrates.
Alternatively, the same components may be indirectly bonded
through a ligand-antiligand complex, such as antibodies
reactive with a ligand conjugated with label. The choice of
label depends on sensitivity required, ease of conjugation
with the probe, stability requirements, and available
instrumentation.

The choice of label dictates the manner in which the label
is incorporated into the probe. Radioactive probes are
typically made using commercially available nucleotides
containing the desired radioactive isotope. The radioactive
nucleotides can be incorporated into probes, for example, by
using DNA synthesizers, by nick-translation, by tailing of
radioactive bases in the 3' end of probes with terminal
transferase or the 5'-end with a polynucleotide kinase.

Non-radioactive probes can be labeled directly with a
signal (e.g., fluorophore, chemiluminescent agent or enzyme)
or labeled indirectly by conjugation with a ligand. For
example, a ligand molecule is covalently bound to the probe.
This ligand then binds to a receptor molecule which is
either inherently detectable or covalently bound to a
detectable signal, such as an enzyme or photoreactive
compound. Ligands and antiligands may be varied widely.
Where a ligand has a natural "antiligand", namely ligands
such as biotin, thyroxine, and cortisol, it can be used in
conjunction with its labeled, naturally occurring antiligand.
Alternatively, any haptenic or antigenic compound can be
used in combination with a suitably labeled antibody. A
preferred labeling method utilizes biotin-labeled analogs of
oligonucleotides, as disclosed in Langer et al., Proc. Natl.
Acad. Sci. USA 78:6633-6637 (1981), which is incorporated
herein by reference.

Enzymes of interest as reporter groups will primarily be
hydrolases, particularly phosphatases, esterases, ureases and
glycosidases, or oxidoreductases, particularly peroxidases.
Fluorescent compounds include fluorescein and its
derivatives, rhodamine and its derivatives, dansyl,
umbelliferone, rare earths, etc. Chemiluminescers include
luciferin, acridinium esters and 2,3-dihydrophthalazinediones,
e.g., luminol. A further description of [text illegible in
source] thereof can be found in U.S. Pat. No. 5,419,966, the
specification of which is incorporated herein by reference.

The hybridization conditions are not critical and will vary
in accordance with the investigator's preferences and needs.
The particular hybridization technique is not essential to
the invention. Hybridization techniques are generally
described in "Nucleic Acid Hybridization, A Practical
Approach", Hames and Higgins, Eds., IRL Press, 1985; Gall
and Pardue, Proc. Natl. Acad. Sci., U.S.A., 63:378-383
(1969); and John et al., Nature, 223:582-587 (1969). As
improvements are made in hybridization techniques, they can
readily be applied.

The amount of labeled probe which is present in the
hybridization solution may vary widely. Generally, a
substantial excess of probe over the stoichiometric amount of
the target duplex nucleic acid will be employed to enhance
the rate of binding of the probe to the target sequence.

After hybridization at a temperature and time period
appropriate for the particular hybridization solution used,
the glass, plastic, or filter support to which the
probe-target hybrid is attached is introduced into a wash
solution typically containing similar reagents as provided in
the hybridization solution. Either the hybridization or the
wash medium can be stringent. After appropriate stringent
washing, the correct hybridization complex may now be
detected in accordance with the nature of the label.

The probe may be conjugated directly with the label. For
example, where the label is radioactive, the support surface
with associated hybridization complex substrate is exposed
to X-ray film. Where the label is fluorescent, the sample is
detected by first irradiating it with light of a particular
wavelength. The sample absorbs this light and then emits
light of a different wavelength which is picked up by a
detector ("Physical Biochemistry", Freifelder, D., W. H.
Freeman & Co., 1982, pp. 537-542). Where the label is an
enzyme, the sample is detected by incubation with an
appropriate substrate for the enzyme. The signal generated
may be a colored precipitate, a colored or fluorescent soluble
material, or photons generated by bioluminescence or
chemiluminescence. The preferred label for dipstick assays
generates a colored precipitate to indicate a positive reading.
For example, alkaline phosphatase will dephosphorylate
indoxyl phosphate which then will participate in a reduction
reaction to convert tetrazolium salts to highly colored and
insoluble formazans.

Detection of a hybridization complex may require the
binding of a signal generating complex to a duplex of target
and probe polynucleotides or nucleic acids. Typically, such
binding occurs through ligand and antiligand interactions as
between a ligand-conjugated probe and an antiligand
conjugated with a signal.

The label may also allow indirect detection of the
hybridization complex. For example, where the label is a
hapten or
-continued

[Reaction scheme drawings not reproduced; legible labels:
DMTrO, NH-COCH2OPh, CNCH2CH2OP(Cl)N(iPr)2]
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Reaction Scheme 3 




The nucleotide moiety shown in Formula 4b is obtained
by the method illustrated in Reaction Scheme 4 which
substantially follows known chemical literature. First the
"furan" analog deoxyribofuranoside, namely 3-(2'-deoxy-β-
D-ribofuranosyl)furano-[2,3-d]pyrimidine-6(5H)-one
(Compound 11) is synthesized by copper (I)-catalyzed
cyclization from the known antiviral nucleoside 5-ethynyl-
2'-deoxyuridine (Compound 10), substantially as in the
literature procedure of Robins et al. J. Org. Chem. 1983, 48,
1854. This compound is dimethoxytritylated and converted
into the corresponding cyanoethoxy phosphoramidite
(Compound 12) suitable as a reagent for ODN synthesis,
substantially by conventional literature methods (see Sinha
et al. Nucleic Acids Research. 1984, 12, 4539). The SBC
ODNs of the present invention are then constructed on a
solid support. The final step of treating the SBC ODN with
ammonia to remove protecting groups converts the furano-
[2,3-d]pyrimidine-6(5H)-one base into the pyrrolo-[2,3-d]
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not with the complementary SBC ODN. Thus, members of 
a matched pair of SBC ODNs were found to form stable 
hybrids with their respective natural complementary targets, 
but not with each other. Table 1 below indicates the melting 
temperatures observed under the conditions indicated in the 
table, and also the calculated decrease (drop) in melting 
temperature per modified base pair. 
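
The "Tm drop per modified base pair" column of Table 1 can be checked with a short calculation. The Python sketch below is an illustration, not the inventors' stated method: it assumes the drop is measured against unmodified Hybrid I and divided by the number of X and Y positions in the 28-mer (17), an assumption that reproduces the tabulated values. The function and variable names are hypothetical.

```python
# Illustrative check of the Tm drop per modified base pair in Table 1.
# Assumption: drop = (Tm of Hybrid I - Tm of hybrid) / number of X,Y positions.

WATSON = "XTYAXAAXYATXYYAYYAXXYAAYYAYX"   # 28-mer pattern from Table 1

# Measured melting temperatures (deg C) listed in Table 1.
TM = {"I": 75.6, "II": 48.2, "III": 57.2, "IV": 20.2}


def count_modifiable_positions(pattern):
    """Number of X and Y positions, i.e. base pairs that can be modified."""
    return sum(1 for base in pattern if base in "XY")


def tm_drop_per_modified_pair(tm_reference, tm_hybrid, n_pairs):
    """Average decrease in Tm per modified base pair."""
    if n_pairs <= 0:
        raise ValueError("n_pairs must be positive")
    return (tm_reference - tm_hybrid) / n_pairs


n_pairs = count_modifiable_positions(WATSON)
drops = {
    hybrid: round(tm_drop_per_modified_pair(TM["I"], tm, n_pairs), 2)
    for hybrid, tm in TM.items()
}

for hybrid, drop in drops.items():
    print(f"Hybrid {hybrid}: Tm = {TM[hybrid]} C, drop per pair = {drop}")
```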
TABLE 1

Tm Values for Native and Modified ODNs with dI and dP

Watson: 5' XTY AXA AXY ATX YYA YYA XXY AAY YAY X 3'
Crick:  3' YAX TYT TYX TAY XXT XXT YYX TTX XTX Y 5'

         Watson   Crick              Tm Drop per         Sequence ID NO
Hybrid   X   Y    X   Y   Tm (°C.)*  Modified Base Pair  (as printed)
I        C   G    C   G   75.6       0                   1
II       P   I    C   G   48.2       1.61                3, 2
III      C   G    P   I   57.2       1.08                7, 4
IV       P   I    P   I   20.2       3.26                8

*10 mM Tris-HCl (pH 8.0), 0.1 mM EDTA, 50 mM NaCl, 10 mM MgCl2

oligodeoxynucleotides wherein X and Y are natural dC and
dG residues in both ODNs. Thus, Hybrid I provides a
reference, to which other hybrids formed of modified SBC
ODNs can be compared. The matched pair of SBC ODNs shown
as Hybrid IV in Table 1 comprises two 28-mer sequences
where each of the natural dG and dC nucleotides is replaced
with dI and dP, respectively. Hybrid IV is unstable with a
melting temperature of 20.2° C. Nevertheless, each member
of this pair forms a stable hybrid with its natural
complement, in Hybrids II and III.

PAGE analysis also showed that the two members of the
matched pair of SBC 28-mers do not hybridize in a stable
manner, and that each SBC ODN and its natural complement
form a stable hybrid. Moreover, the normal Watson strand
showed no preference for the normal Crick strand over the
SBC Crick strand because when equimolar amounts of these
three strands were mixed simultaneously at room temperature
about equal amounts of the duplex Hybrids I and III were
formed. Additionally, there was little, if any, strand
displacement or strand exchange when pre-formed Hybrid
III was incubated with the normal homolog of the SBC
[text illegible in source] Hybrid II. These data demonstrate
that the SBC ODNs behave like natural ODNs when hybridized
with their unmodified complementary strands, while they do
not form stable hybrids with themselves.

Table 2 refers to a complementary pair of 20-mer
oligodeoxyribonucleotides (ODN V and ODN VI) which are
hybridized under substantially physiological conditions
(0.2M NaCl, 0.01M Na2HPO4, 0.1 mM EDTA, pH 7.0; ODN
concentration 4×10^-7 M). The ODNs designated in Table 2
[text missing in source] replaced by d2amA and d2sT,
respectively. The melting temperatures of these pairs are
indicated in the Table.

TABLE 2

Sequence ID NO: 5   ODN V     5'-GTAAGAGAATTATGCAGTGC-3'
Sequence ID NO: 6   ODN VI    3'-CATTCTCTTAATACGTCACG-5'
Sequence ID NO: 7   SBC(V)    [sequence illegible in source]
Sequence ID NO: 8   SBC(VI)   3'-C2amA2sT2sTC2sTC[remainder illegible in source]

MELTING TEMPERATURE OF HYBRIDS

Hybrid                                    Tm
ODN(V) with ODN(VI)                       55° C.
ODN(V) with SBC(VI), SBC(V) with ODN(VI)  64° C. and 65° C. (assignment to the
                                          individual pairs is not recoverable
                                          from the source layout)
SBC(V) with SBC(VI)                       26° C.

As can be seen, the ODNs fully modified with the
preferred A* and T modifications of the present invention
exhibit even stronger binding to the natural complementary
ODNs than the binding between two natural complementary
strands. At the same time, the matched pair of SBC ODNs
are nevertheless incapable of forming a stable hybrid with
each other (their melting temperature is 26° C.).

Additional experiments conducted in accordance with the
present invention, in terms of melting temperature
measurements and PAGE analysis, showed that a matched pair
of SBC ODNs complementary to both strands of a target
sequence of double stranded DNA is capable of invading the
natural duplex nucleic acid to give a stable 3-armed joint.
Analogous paired normal DNA ODNs failed to invade the
same target. In case of long double stranded DNA,
sequentially hybridizing the paired SBC ODNs to each member of
was kept at 80-90° C. for 5 hours and then evaporated to a
syrup. The syrup was dissolved in dichloromethane (10 mL)
and added to ice cold methanolic ammonia (100 mL) in a
glass pressure bottle. After two days at RT the contents of the
bottle were evaporated to dryness. The residue was dissolved
in methanol and adjusted to pH 8 with freshly prepared
sodium methoxide to complete the deprotection. After
stirring overnight the solution was treated with Dowex-50
H+ resin, filtered and evaporated to dryness. The residue was
chromatographed on silica gel using acetone/hexane (3/2) as
eluent to give 2.0 g (77%) of analytically pure product.

Example 7

1-(2-Deoxy-β-D-erythropentofuranosyl)-3-[5-(tritylamino)-
pentyl]pyrazolo[3,4-d]pyrimidin-4-amine 5'-monophosphate

To an ice cold solution of the pyrazolopyrimidin-4-amine
of Example 6 (250 mg, 0.43 mmole) in trimethyl phosphate
(5 mL) was added phosphoryl chloride (50 µL) and the
solution was kept at 0-4° C. The reaction was monitored by
reversed phase HPLC using a linear gradient from 0 to 100%
acetonitrile in water over 25 min. After stirring for 5 hours,
an additional aliquot of phosphoryl chloride (25 µL) was
added and the solution was stirred another 30 min. The
solution was poured into 0.1M ammonium bicarbonate and
kept in the cold overnight. The solution was then extracted
with ether and the aqueous layer evaporated to dryness. The
residue was dissolved in water (5 mL) and purified by
reversed phase HPLC using a 22 mm×50 cm C18 column.
The column was equilibrated in water and eluted with a
gradient of 0 to 100% acetonitrile over 20 min. Fractions
containing the desired material were pooled and lyophilized
to give 160 mg (56%) of chromatographically pure
nucleotide.

Example 8

1-(2-Deoxy-β-D-erythropentofuranosyl)-3-{5-[(6-
biotinamido)hexanamido]pentyl}pyrazolo[3,4-d]pyrimidin-
4-amine 5'-monophosphate.

An ethanol solution (10 mL) of the nucleotide of Example
7, palladium hydroxide on carbon (50 mg), and
cyclohexadiene (1 mL) was refluxed for 3 days, filtered, and
evaporated to dryness. The residue was washed with
dichloromethane, dissolved in DMF (15 mL) containing
triethylamine (100 mL), and treated with
N-hydroxysuccinimidyl biotinylaminocaproate (50 mg).
After stirring overnight an additional amount of
N-hydroxysuccinimidyl 6-biotinamidocaproate (50 mg) was
added and the solution was stirred for 18 hours. The reaction
mixture was evaporated to dryness and chromatographed
following the procedure in Example 7. Fractions were
pooled and lyophilized to give 80 mg of
chromatographically pure biotinamido-substituted nucleotide.

Example 9

1-(2-Deoxy-β-D-erythropentofuranosyl)-3-[5-(6-
biotinamido)-hexanamidopentyl]pyrazolo[3,4-d]pyrimidin-
4-amine 5'-triphosphate

The monophosphate of Example 8 (80 mg, ca. 0.1 mmole)
was dissolved in DMF with the addition of triethylamine (14
µL). Carbonyldiimidazole (81 mg, 0.5 mmole) was added
and the solution stirred at RT for 18 hours. The solution was
treated with methanol (40 µL), and after stirring for 30
minutes tributylammonium pyrophosphate (0.5 g in 0.5 mL
DMF) was added. After stirring for 24 hours another aliquot
of tributylammonium pyrophosphate was added and the
solution was stirred overnight. The reaction mixture was
evaporated to dryness and chromatographed following the
procedure in Example 8. Two products were collected and
were each separately treated with conc. ammonium
hydroxide (1 mL) for 18 hours at 55° C. UV and HPLC analysis
indicated that both products were identical after ammonia
treatment and were pooled and lyophilized to give 35.2 mg
of nucleoside triphosphate.

Example 10

Nick-Translation Reaction

The triphosphate of Example 9 was incorporated into
pHPV-16 using the nick translation protocol of Langer et al.
(supra). The probe prepared with the triphosphate of
Example 9 was compared with probe prepared using
commercially available bio-11-dUTP (Sigma Chemical Co). No
significant differences could be observed in both a filter
hybridization and in in situ smears.

More specifically, the procedure involved the following
materials and steps.

Materials:

[entries for DNase, DNA polymerase I, 10x-DP and the
nucleotide mixes are illegible in source]

pHPV-16, 2.16 mg/mL, which is a plasmid containing the
genomic sequence of human papillomavirus type 16

Bio-12-dAPPTP, 1.0 mg/mL

Steps:

To an ice cold mixture of 10x-DP (4 mL), pHPV-16 (2 mL),
nucleotide mix A (6 mL), Bio-12-dAPPTP (2 mL), and H2O
(20 mL) was added DNase (1 mL) and DNA polymerase I
(2.4 mL). The reaction mixture was incubated at 16° C. for
1 hour. The procedure was repeated using Bio-11-dUTP and
nucleotide mix U in place of Bio-12-dAPPTP (comprising
the triphosphate of Example 9) and nucleotide mix A.

Nucleic acid was isolated by ethanol precipitation and
hybridized to pHPV-16 slotted onto nitrocellulose. The
hybridized biotinylated probe was visualized by a
streptavidin-alkaline phosphatase conjugate with BCIP/
NBT substrate. Probe prepared using either biotinylated
nucleotide gave identical signals. The probes were also
tested in an in situ format on cervical smears and showed no
qualitative differences in signal and background.

Example 11

5-Amino-3-[(5-tritylamino)pentyl]pyrazole-4-carboxamide.

Following the procedure of Example 2, except that
cyanoacetamide is used instead of malononitrile,
5-(tritylamino)pentylhydroxymethylenecyanoacetamide is
prepared from 6-(tritylamino)caproic acid. This is then
treated with diazomethane to give the methoxy derivative,
following the procedures of Example 3, which is then
reacted with hydrazine monohydrate, as in Example 4, to
give 5-amino-3-[(5-tritylamino)-pentyl]pyrazole-4-
carboxamide.

Example 12

4-Hydroxy-6-methylthio-3-[(5-tritylamino)pentyl]pyrazolo-
[3,4-d]pyrimidine.

The carboxamide from Example 11 is reacted with
potassium ethyl xanthate and ethanol at an elevated
temperature to give the potassium salt of
4-hydroxypyrazolo[3,4-d]pyrimidine-6-thiol. This salt is then
reacted with
oligonucleotides came off at 10 minutes; amino derivatives
took 11-12 minutes. The desired oligonucleotide was
collected and evaporated to dryness, then it was redissolved in
80% aqueous acetic acid for 90 minutes to remove the trityl
group. Desalting was accomplished with a G25 Sephadex
column and appropriate fractions were taken. The fractions
were concentrated, brought to a specific volume, a dilution
reading taken to ascertain overall yield and an analytical
HPLC done to assure purity. Oligonucleotides were frozen at
-20° C. until use.

In general, to add the crosslinking arm to an
aminoalkyloligonucleotide, a solution of 10 µg of the
aminoalkyloligonucleotide and a 100X molar excess of
N-hydroxysuccinimide haloacylate such as α-haloacetate or
4-halobutyrate in 10 µL of 0.1M borate buffer, pH 8.5, is
incubated at ambient temperature for 30 min. in the dark.
The entire reaction is passed over a NAP-10 column
equilibrated with and eluted with distilled water. Appropriate
fractions based on UV absorbance are combined and the
concentration is determined spectrophotometrically.

2,3,5,6-Tetrafluorophenyl trifluoroacetate.

A mixture of 2,3,5,6-tetrafluorophenol (55.2 g, 0.33 mol),
trifluoroacetic anhydride (60 mL, 0.42 mol) and boron
trifluoride etherate (0.5 mL) was refluxed for 16 hr.
Trifluoroacetic anhydride and trifluoroacetic acid were removed
by distillation at atmospheric pressure. The trifluoroacetic
anhydride fraction (bp 40° C.) was returned to the reaction
mixture along with 0.5 mL of boron trifluoride etherate, and
the mixture was refluxed for 24 hr. This process was
repeated two times to ensure complete reaction. After
distillation at atmospheric pressure, the desired product was
collected at 62° C./45 mm (45° C./18 mm) as a colorless
liquid: yield 81.3 g (93%); d 1.52 g/mL; nD 1.3747; IR
(CHCl3) 3010, 1815, 1525, 1485, 1235, 1180, 1110, and 955
cm-1. Anal. Calcd for C8HF7O2: C, 36.66; H, 0.38; F, 50.74.
Found: [values illegible in source].

2,3,5,6-Tetrafluorophenyl-4'-[bis(2-chloroethyl)amino]
phenylbutyrate (Chlorambucil 2,3,5,6-tetrafluorophenyl
ester)

To a solution of 0.25 g (0.82 mmol) of chlorambucil
(supplied by Fluka A. G.) and 0.3 g (1.1 mmol) of 2,3,5,6-
tetrafluorophenyl trifluoroacetate in 5 ml of dry
dichloromethane was added 0.2 ml of dry triethylamine. The
mixture was stirred under argon at room temperature for 0.5
h and evaporated. The residual oil was purified by column
chromatography on silica gel with hexane-chloroform (2:1)
[text illegible in source] to give the ester as an oil: 0.28 g
[yield and analytical data illegible in source]; IR (in CHCl3)
[values illegible in source] cm-1.

2-Propargyloxyethylamine

2-Propargyloxyethylamine [text illegible in source] was
prepared by condensing propynol with 2-bromoethylammonium
bromide in liquid ammonia in the presence of NaNH2, and
was used crude for the next reaction.

3-(2-Trifluoroacetamidoethoxy)propyne

(2-Propargyloxyethyl)amine (13.8 g, 0.14 mol) is stirred
and chilled in an iso-propanol-dry ice bath while excess of
trifluoroacetic anhydride (26 ml, 0.18 mol) is added
dropwise. N-(2-Propargyloxyethyl)trifluoroacetamide is distilled
at 84-85°/1.7 torr as an oil which solidified upon
refrigeration; yield 14.4 g (52%), m.p. (160, 1.4110. Anal.
[values partly illegible in source] N, 7.06; F, 29.38.

(triphenylphosphine)palladium(0) (0.58 g, 0.5 mmol) is
dried in vacuo at 60° for 3 hours and placed under argon. A
suspension of the mixture in dry DMF (20 ml) is stirred
under argon and treated with dry triethylamine (1.7 mL, 12
mmol) followed by 3-(2-Trifluoroacetamidoethoxy)propyne
(3.17 g, 16 mmol). The mixture is cooled at room
temperature in a water bath and stirred for 17 hours. The
mixture is treated with 2% acetic acid (100 ml), the catalyst is
removed by filtration and washed with 50% methanol. The
filtrates are combined and passed onto a LiChroprep RP-18
column (5x25 cm), the column is washed, then eluted with 1%
acetic acid in 50% (v/v) methanol. The fractions with the main
product are combined, evaporated, and dried in vacuo. The
resultant foam is stirred with 150 ml of ether to give
crystalline product; yield 3.6 g (85%); m.p. 145-152°.

5-[3-(2-Trifluoroacetamidoethoxy)propyl]-2'-deoxyuridine

A solution of 5-[3-(2-trifluoroacetamidoethoxy)
propynyl]-2'-deoxyuridine (3.4 g, 8.1 mmol) in methanol
(20 ml) is stirred with ammonium formate (prepared by
addition of 3 ml, 79 mmol of cold 98% formic acid into 2
ml, 50 mmol of dry ice frozen 25% ammonia) and 0.2 g of
10% Pd/C for 7 hours at room temperature under hydrogen
atmosphere. The catalyst is removed by filtration, the filtrate
evaporated and product is purified on LiChroprep RP-18
column by the above procedure. Fractions containing the
desired product are combined and evaporated to dryness in
vacuo and the resultant solid is triturated with dry ether to
give 3.0 g (87%) product, m.p. 107-110°; λmax in nm, in 0.1M
triethylamine-acetate (pH 7.5), 220, 268. Analysis
calculated for C16H22F3N3O7: C, 45.18; H, 5.21; N, 9.88; F,
13.40. Found: C, 45.16; H, 5.16; N, 9.68; F, 13.13.

Introduction of chlorambucil residue into the primary amino
groups of oligonucleotides

Preparation of the cetyltrimethylammonium salt of
oligonucleotides: a 100 µL aliquot of aqueous solution of
oligonucleotide (50-500 µg), generally triethylammonium salt,
was injected to a column packed with Dowex 50wx8 in the
cetyltrimethylammonium form and prewashed with 50%
alcohol in water. The column was eluted by 50% aqueous
ethanol (0.1 mL/min). Oligonucleotide containing fraction
was dried on a Speedvac over 2 hours and used in following
reactions.

Ethanol solution (50 µL) of cetyltrimethylammonium salt
of an oligonucleotide (50-100 µg) was mixed with 0.08M
solution of 2,3,5,6-tetrafluorophenyl-4'-[bis(2-chloroethyl)
amino]phenylbutyrate (tetrafluorophenyl ester of
chlorambucil) in acetonitrile (50 µL) and 3 µL of
diisopropylethylamine. After shaking for three hours at room
temperature, the product was precipitated by 2% LiClO4 in
acetone (15 mL). The product was reprecipitated from
water (60 µL) by 2% LiClO4 in acetone three times. Finally
the chlorambucil derivative of the oligonucleotide was
purified by Reverse Phase Chromatography with approximately
50-80% yield. The fraction containing the product was
concentrated by addition of butanol. The isolated
chlorambucil derivative of the oligonucleotide was precipitated in
acetone solution with LiClO4, washed by acetone and dried
under vacuum. All manipulations of reactive oligonucleotide
were performed as quickly as possible, with the product in
ice-cold solution.

Preparation of SBC ODNs

N-phenoxyacetyl protected 2'-deoxyguanosine and
2'-deoxycytidine 3'-O-2-cyanoethyl-N,N'-

l' ^-^-influoroacetamidoethoxyJ.Propynyl]^'. diisopropylphosphoramidite are available commercially 
aeoxyundme 65 from BioGenex, Alameda, California. 5 , -0-dimethoxytrityl- 

Amixtoreof5-iodo.2-deoxyundine(3.54g,10mmol), 2-tbiothy midine^.O-^-cyanoethyl-NV- 
copperU) iodide (0.19 g, 1 mmol) and tetrakis diisopropylphosphoramidite) was prepared using the proce- 
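The "Analysis calculated" figures quoted in the procedures above can be checked from the molecular formulas alone. The sketch below is only an illustration: the atomic masses are standard rounded values that do not come from the text, and `percent_composition` is a hypothetical helper name.

```python
# Standard atomic masses in g/mol (rounded; an assumption, not from the source).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "F": 18.998}

def percent_composition(formula):
    """Return the mass percent of each element for a dict such as {"C": 8, "H": 1}."""
    total = sum(ATOMIC_MASS[element] * count for element, count in formula.items())
    return {
        element: round(100.0 * ATOMIC_MASS[element] * count / total, 2)
        for element, count in formula.items()
    }

# C16H22F3N3O7, reported as C 45.18, H 5.21, N 9.88, F 13.40
c16 = percent_composition({"C": 16, "H": 22, "F": 3, "N": 3, "O": 7})

# C8HF7O2, reported as C 36.66, H 0.38, F 50.74
c8 = percent_composition({"C": 8, "H": 1, "F": 7, "O": 2})

print(c16)
print(c8)
```

Both calculated sets agree with the quoted values to within rounding, which supports reading the garbled formulas as C16H22F3N3O7 and C8HF7O2.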



SEQUENCE LISTING 



(1) GENERAL INFORMATION : 

(iii) NUMBER OF SEQUENCES: 8

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..28

(D) OTHER INFORMATION: /note= "corresponds to Watson
strand of Hybrids I [...]"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

CTGACAACGA TCGGAGGACC GAAGGAGC 

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..28

(D) OTHER INFORMATION: /note= "corresponds to Crick
strand of Hybrids I [...]"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

GCTCCTTCGG TCCTCCGATC GTTGTCAG 

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of(1, 5, 8, 12, 19, 20, 28)

(D) OTHER INFORMATION: /mod_base= OTHER

/note= "pyrrolo-[2,3-d]pyrimidine-2(3H)-one"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of(3, 9, 13, 14, 16, 17, 21, 24, 25, 27)

/note= "hypoxanthine"

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..28

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
NTNANAANNA TNNNANNANN NAANNANN

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 






5,912340 




(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of(3, 8, 10, 11, 19)

(D) OTHER INFORMATION: /mod_base= OTHER

/note= "d2amAdenine replaces all dAdenine"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of(5, 9, 12, 13, 15, 17, 18)

(D) OTHER INFORMATION: /mod_base= OTHER

/note= "d2sThymine replaces all dThymine"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
GCACTGCATA ATTCTCTTAC 20 
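
The relationships stated in the listing can be checked mechanically. The following sketch uses hypothetical helper names and only the sequences printed above. It confirms that SEQ ID NO: 2 is the reverse complement of SEQ ID NO: 1, and it reports the 1-based positions of A and T in SEQ ID NO: 8 so they can be compared with the LOCATION fields of the modified_base features.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Watson-Crick reverse complement of a DNA string written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def positions(seq, base):
    """1-based positions of `base`, the numbering used in LOCATION fields."""
    return [i for i, b in enumerate(seq, start=1) if b == base]

seq1 = "CTGACAACGATCGGAGGACCGAAGGAGC"   # SEQ ID NO: 1
seq2 = "GCTCCTTCGGTCCTCCGATCGTTGTCAG"   # SEQ ID NO: 2
seq8 = "GCACTGCATAATTCTCTTAC"           # SEQ ID NO: 8

print(reverse_complement(seq1) == seq2)  # the two strands are fully complementary
print(positions(seq8, "A"))              # sites replaced by 2-aminoadenine
print(positions(seq8, "T"))              # sites replaced by 2-thiothymine
```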




What is claimed is: 

1. A pair of oligonucleotides (ODNs), each of said ODNs
comprising nucleotide moieties having naturally occurring
aglycon bases and a combination of modified aglycon bases
selected from the group consisting of the combinations (1)
A', T', (2) G', C' and (3) A', T', G', C', the duplex form of said
pair of ODNs having a melting temperature under physiological
conditions of less than approximately 40° C., each
of said pair of ODNs being substantially complementary in
the Watson-Crick sense to one of the two strands of a
duplexed target sequence in nucleic acid,

wherein the nucleotide moieties having the modified
bases have the following properties:
within complementary oligonucleotides A' does not
form a stable hydrogen bonded base pair with T' and
forms a stable hydrogen bonded base pair with T;
within complementary oligonucleotides T' does not
form a stable hydrogen bonded base pair with A' and
forms a stable hydrogen bonded base pair with A;
within complementary oligonucleotides G' does not
form a stable hydrogen bonded base pair with C' and
forms a stable hydrogen bonded base pair with C,
and
within complementary oligonucleotides C' does not
form a stable hydrogen bonded base pair with G' and
forms a stable hydrogen bonded base pair with G.
2. The ODNs of claim 1 wherein the nucleotide moiety A'
has the structure selected from the groups shown by formulas
(i), (ii) and (iii)

[Structural formulas (i) to (viii) not reproduced]

wherein
Y is O or S;
Z is OH or CH3;
R is H, F, or OR2, where R2 is H, C1-6 alkyl or [...]
R4 is H, alkyl, alkenyl, C2-6 alkynyl, a cross-linking
function or a reporter group;
Z1 is O or NH, and
R3 is H, or alkyl.
6. The ODNs of claim 2 wherein the nucleotide moiety A'
has the structure in accordance with formula (i).

7. The ODNs of claim 6 wherein X is N, Z is OH, and Y
is O.

8. The ODNs of claim 7 wherein R1 is NH2.

9. The ODNs of claim 3 wherein Z is OH, and Y is O.

10. The ODNs of claim 9 wherein R4 is CH3.

11. The ODNs of claim 4 wherein the nucleotide moiety
G' has the structure in accordance with formula (v).

12. The ODNs of claim 11 wherein X is N, Z is OH, and
Y is O.

13. The ODNs of claim 12 wherein R1 is H.

14. The ODNs of claim 5 wherein the nucleotide moiety
C' has the structure in accordance with formula (viii).

15. The ODNs of claim 14 wherein Z is OH, Z2 is NH and
Y is O.

16. The ODNs of claim 15 wherein R5 is H.

17. The ODNs of claim 1 having approximately 5 to 99
nucleotide units.

18. The ODNs of claim 1 wherein each of the nucleotides
is a 2'-deoxyribonucleotide.

19. The ODNs of claim 1 wherein each of the nucleotides
is a ribonucleotide.

20. The ODNs of claim 1 comprising at least one nucleotide
unit having a 2'-O-methylribose moiety.

21. The ODNs of claim 1 comprising a cross-linking
function covalently attached to at least one nucleotide unit.

22. The ODNs of claim 1 comprising a reporter group.

23. The ODNs of claim 1 wherein the combination of
modified aglycon bases is A', T'.

24. The ODNs of claim 1 wherein the combination of
modified aglycon bases is G', C'.

25. The ODNs of claim 1 wherein the combination of
modified aglycon bases is A', T', G', C'.

26. The ODNs of claim 1 where the pair of oligonucleotides
are linked to one another by a covalently bonded
tether.
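
The selective pairing recited in claim 1 can be illustrated with a small lookup table. This is a sketch under stated assumptions: primes mark the modified bases, so A' pairs stably with natural T but not with modified T', and likewise for the other analogs. The six-base sequence and all function names are hypothetical, and "stable" is treated as a simple yes/no property rather than a melting temperature.

```python
# Stable pairs under the reading of claim 1 assumed here (primes = modified bases).
STABLE_PAIRS = {
    frozenset(["A", "T"]), frozenset(["G", "C"]),      # natural Watson-Crick pairs
    frozenset(["A'", "T"]), frozenset(["A", "T'"]),    # one modified partner: stable
    frozenset(["G'", "C"]), frozenset(["G", "C'"]),
}
# A'/T' and G'/C' are deliberately absent: both partners modified, no stable pair.

SBC_SUBSTITUTION = {"A": "A'", "T": "T'"}  # combination (1) of claim 1

def substitute(strand, table):
    """Replace natural bases by their analogs according to `table`."""
    return [table.get(base, base) for base in strand]

def stable_fraction(strand, partner):
    """Fraction of stable pairs when two 5'->3' strands align antiparallel."""
    pairs = list(zip(strand, reversed(partner)))
    return sum(frozenset(pair) in STABLE_PAIRS for pair in pairs) / len(pairs)

watson = list("GCATAT")   # hypothetical toy sequence, 5'->3'
crick = list("ATATGC")    # its reverse complement, 5'->3'
sbc_watson = substitute(watson, SBC_SUBSTITUTION)
sbc_crick = substitute(crick, SBC_SUBSTITUTION)

print(stable_fraction(sbc_watson, crick))      # SBC strand against natural target
print(stable_fraction(sbc_watson, sbc_crick))  # SBC strand against SBC strand
```

In this toy case each SBC strand pairs fully with its natural complement, while the SBC-SBC duplex retains only its two G-C pairs. That is the behavior the claim describes for the pair of ODNs.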

* * * * *

ANALOGS OF GUANINE NUCLEOSIDE
TRIPHOSPHATES FOR SEQUENCING
APPLICATIONS

Mark G. McDougall, Lei Sun, Inna Livshin, Louis P. Hosta,
Bernard F. McArdle, Sui-Bi Samols, Carl W. Fuller,
and Shiv Kumar*

Amersham Pharmacia Biotech, 800 Centennial Ave., Piscataway,
New Jersey 08855

ABSTRACT

We have synthesized more than 30 different deoxyribonucleosides and
triphosphates [...] either in the [...] or the phosphate [...] modified
nucleoside [...] substrates for DNA polymerases, including Sequenase™ T7
DNA polymerase or Thermo Sequenase™ DNA polymerase. Two of the analogs,
7-ethyl-7-deaza-dGTP and 7-hydroxymethyl-7-deaza-dGTP, meet our
requirements as better sequencing reagents.

One of the most common difficulties in DNA [...] (1). Fragments with these
secondary structures migrate [...] and
7-deaza-2'-deoxyguanosine-5'-triphosphate (2) (dZTP, [...]) for
sequencing. Both these analogs have specific problems, which prompted us to
undertake this line of research.

Although dITP has excellent compression resolving properties, eliminating 
essentially all compressions, the reactivity with both Sequenase™ T7 DNA
polymerase and Thermo Sequenase™ DNA polymerase is low (10 to 20% that of dGTP),
and the uniformity of terminations in the sequencing reactions exhibits sequence-specific
variations, which can be severe. This makes sequence interpretation more difficult
and certain sequencing applications impossible. 7-Deaza-2'-deoxyguanosine-5'- 
triphosphate has good reactivity with DNA polymerases and reasonable uniformity. 
However, it resolves only 75-80% of all compressions in slab gels and less in cap- 
illary electrophoresis where the denaturing conditions are weaker. Therefore, the 
goal of this research was to find an analog with better compression resolution than 
dZTP and better reactivity and uniformity than dITP. 

In a research program directed towards finding better substrates for the DNA 
polymerase than dITP and better compression resolution properties than 7-deaza- 
dGTP, we have prepared a number of 2'-deoxynucleoside-5'-triphosphates as novel 
analogs of the guanine bases (Fig. 1). Most of the nucleosides were prepared by 
using literature procedures with minor modifications and the triphosphates were
prepared using P(O)Cl3 and bis(tri-n-butylammonium)pyrophosphate in a "one-pot,
three-step" phosphorylation procedure (3). The compounds 6-thio dITP (2)
and 6-thio-dGTP (3) were prepared as previously reported (4). The nucleosides 2-
bromo- and 2-fluoro-2'-deoxyinosine were synthesized according to the methods 
developed by Robins (5). 5-Aza-7-deaza-2'-deoxyguanosine (6) and 3,7-dideaza- 
2'-deoxyguanosine (7) were synthesized using minor modifications of published 
procedures. 7-Halogenated-7-deaza dGTP analogs have been previously reported 
(8). The alkynyl series of 7-deaza-2'-deoxyguanosine was prepared according to the 
literature methods (9,10). The synthesis of 7-alkyl- and 7-hydroxyalkyl-7-deaza-2' 
deoxyguanosine as well as 7-substituted-7-deaza-2'-deoxyinosine nucleosides will 
be reported elsewhere (11). All these nucleoside triphosphates were tested for activ- 
ity as substrates for DNA polymerase. The analog triphosphates that proved active 
were also tested for sequencing on templates with known compression artifacts and 
scored relative to dGTP, dZTP and dITP. Specifically, these analogs were used as 
substrates replacing dGTP in dye primer cycle sequencing reactions with Thermo 
Sequenase™ on double stranded pGEM 3Zf(+), starting at the —28 reverse primer. 
The ratio of an analog to ddGTP was varied in the sequencing reaction mixtures, 
which were the same as typically used in standard primer labeled sequencing
reactions for dGTP or dZTP (12). The results of these experiments are given in Table 1.

Three characteristics for each analog triphosphate are recorded in Table 1.
Relative reactivity is a rough assessment of the compound as a substrate for
the exonuclease-free Thermo Sequenase™ DNA polymerase. These values were
determined from the optimal sequencing reaction ratios, reflecting
competition between the analog and ddGTP. The comment "stops" indicates the
inability of the polymerase to extend past sites of multiple analog
incorporation. Band uniformity is a term used for changes in the
sequence-specific incorporation of ddGTP, as observed by variations in band
intensity. The values reported are the variance of band intensities
normalized by an algorithm. Absolute uniformity is scored as zero.

Figure 1. Structures of the analogs (structural drawings not reproduced).

Purine series, substituents X, Y and Z:
  dGTP (1)                       X = N, Y = NH2, Z = O
  6-thio-dITP (2)                X = N, Y = H, Z = S
  6-thio-dGTP (3)                X = N, Y = NH2, Z = S
  2-bromo-dITP (4)               X = N, Y = Br, Z = O
  2-fluoro-dITP (5)              X = N, Y = F, Z = O
  7-deaza-dGTP (8)               X = CH, Y = NH2, Z = O
  dITP (22)                      X = N, Y = H, Z = O
  7-propynyl-7-deaza-dITP (23)   X = C-C≡CCH3, Y = H, Z = O
  7-propyl-7-deaza-dITP (24)     X = C-CH2CH2CH3, Y = H, Z = O

Separate structures: 5-aza-7-deaza dGTP (6); 3,7-dideaza dGTP (7).

7-Substituted 7-deaza-dGTP series:
  R = Cl (9), I (10), CH3 (11), CH2CH3 (12), CH2CH2CH3 (13),
  CH(CH3)2 (14), (CH2)5CH3 (15), C≡CCH3 (16), C≡C(CH2)3CH3 (17),
  CH2OH (18), CH2CH2OH (19), CH2CH2CH2OH (20), C≡CCH2OH (21)

Table 1. Results of Primer Labeled Sequencing Using Thermo Sequenase™
Polymerase

Compound                      Relative Reactivity                 Band            Compression
                                                                  Uniformity(1)   Score(2)
 1. dGTP                      1                                   0.1[...]        0
 2. 6S-dITP                   0.[...] (stops)
 3. 6S-dGTP                   0.9 (stops)                         0.[...]         0
 4. 2Br-dITP                  0.1 (stops)
 5. 2F-dITP                   0.1 to 0.[...] (at pH 9.5 to 7.5)
 6. 5aza-7deaza-dGTP          0
 7. 3,7 dideaza-dGTP          0
 8. 7-Deaza-dGTP (dZTP)       1                                   0.1[...]        1
 9. 7-Chloro-dZTP             1                                                   -1
10. 7-Iodo-dZTP               1.1                                                 -1
11. 7-Methyl-dZTP             1                                   0.1[...]        2
12. 7-Ethyl-dZTP              1                                   0.1[...]        3
13. 7-Propyl-dZTP             0.85                                0.21            3.5
14. 7-Isopropyl-dZTP          0.2 (stops)                                         4
15. 7-Hexyl-dZTP              0.1 (stops)                                         4
16. 7-Propynyl-dZTP           1.25                                0.21            0
17. 7-Hexynyl-dZTP            0.6 (stops)                                         1
18. 7-Hydroxymethyl-dZTP      1                                   0.16            1.5
19. 7-Hydroxyethyl-dZTP       0.2 (stops)
20. 7-Hydroxypropyl-dZTP      0.6                                 0.64            3.5
21. 7-Hydroxypropynyl-dZTP    0.5                                 0.28            0
22. dITP                      0.2                                 0.33            4
23. 7-Propyl-7-deaza dITP     0.4                                 0.25            4
24. 7-Propynyl-7-deaza dITP   0.1 (stops)                         0.3             4

(1) Lower numbers, more uniform heights.
(2) Fewer compressions, higher score.

The nucleoside triphosphates with modifications to the hydrogen bonding
face [...] applications. All the [...] (6) or by ionization (5) led to
inactivity. The nucleotide 2F-dITP [...] have a pKa value of 5.2 as
determined by UV [...] pH 9.5. At this pH, extension with this analog is
almost non-existent; however, somewhat better activity is observed upon
lowering the pH to 7.5. A related pH phenomenon is observed in the
thermodynamic properties of duplexes containing
2-fluoro-2'-deoxyinosine (13).
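
The pH effect reported for 2F-dITP can be put in rough numbers with the Henderson-Hasselbalch relation. The sketch below assumes a single ionizable group with the quoted pKa of 5.2. Treating the protonated (neutral) base as the form that the polymerase accepts is an interpretation made here for illustration only; the text says only that ionization led to inactivity.

```python
def neutral_fraction(pka, ph):
    """Fraction of a monoprotic acid that remains protonated at a given pH."""
    ratio = 10 ** (pka - ph)          # [HA]/[A-] from Henderson-Hasselbalch
    return ratio / (1.0 + ratio)

PKA_2F_DITP = 5.2                     # value quoted in the text

f_high = neutral_fraction(PKA_2F_DITP, 9.5)   # first sequencing pH in the text
f_low = neutral_fraction(PKA_2F_DITP, 7.5)    # pH after lowering

print(f"protonated fraction at pH 9.5: {f_high:.2e}")
print(f"protonated fraction at pH 7.5: {f_low:.2e}")
print(f"fold increase on lowering the pH: {f_low / f_high:.1f}")
```

Under these assumptions the protonated fraction rises roughly 100-fold between pH 9.5 and pH 7.5 yet stays well below 1%, which fits the modest improvement in activity described above.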

Compound 7 is an example of the importance of the nitrogen at the N-3
position for polymerase substrate recognition (14). Although 6-thio dGTP (3) exhibits
good activity under our sequencing conditions, stops or pauses were observed at 
multiple runs of G incorporation. Such behavior renders this analog unsuitable as 
a replacement for dGTP in most sequencing applications (4). 

Nucleotide derivatives of 7-deaza-2'-deoxyguanosine triphosphate with
substitution at the 7-position (9-21) yield an exciting class of polymerase substrates.
7-Halogenated (8) (9-10) as well as 7-propynyl dZTP (16) are excellent substrates,
which easily sequence out to 500 bases with intermediate band uniformity.
Unfortunately, using these analogs made compression artifacts as bad as or worse
than with dGTP.
It is likely that the structural properties that impart duplex stability to double
stranded DNA containing these bases (9,15) are also responsible for stabilizing
the structures which result in compression artifacts. Increasing the size of the
substituent on the triple bond as in compounds 17 and 21 lowered substrate activity,
with 7-hexynyl dZTP (17) exhibiting abortive behavior at multiple G incorporation
sites.

Alkyl derivatives of 7-deaza-2'-deoxyguanosine triphosphate (11-15) display 
a trend where increasing size gives increasing compression resolution but decreas- 
ing polymerase reactivity. 7-Methyl (11) and 7-ethyl dZTP (12) have a relative 
reactivity that matches the parent compound (8). 7-Propyl dZTP (13) shows less 
reactivity and more background noise in primer sequencing reactions. However, the 
ability to resolve compression artifacts is in the order: propyl > ethyl > methyl > 
hydrogen. The increase in steric bulk at the 7-position, such as with isopropyl (14) 
or chain length as with hexyl (15) diminishes the reactivity greatly and elongation 
at specific sites is terminated. 
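
The trade-off described for the alkyl series can be checked against the Table 1 entries. In the sketch below the numbers are transcribed from Table 1 as far as they are legible. The reactivity of the ethyl analog is taken as 1 because the text states that it matches the parent compound, and the 0.85 reactivity cut-off is a hypothetical threshold chosen purely for illustration.

```python
# (label, relative reactivity, compression score) transcribed from Table 1
alkyl_series = [
    ("H (dZTP, 8)", 1.0, 1.0),
    ("methyl (11)", 1.0, 2.0),
    ("ethyl (12)", 1.0, 3.0),      # reactivity taken from the text ("matches the parent")
    ("propyl (13)", 0.85, 3.5),
    ("isopropyl (14)", 0.2, 4.0),
    ("hexyl (15)", 0.1, 4.0),
]

def is_monotonic(values, increasing):
    """True if values never decrease (increasing) or never increase (otherwise)."""
    pairs = list(zip(values, values[1:]))
    if increasing:
        return all(b >= a for a, b in pairs)
    return all(b <= a for a, b in pairs)

reactivity = [r for _, r, _ in alkyl_series]
compression = [c for _, _, c in alkyl_series]

MIN_REACTIVITY = 0.85              # hypothetical cut-off, for illustration only
parent_score = alkyl_series[0][2]
candidates = [
    name for name, r, c in alkyl_series
    if r >= MIN_REACTIVITY and c > parent_score
]

print(is_monotonic(reactivity, increasing=False))
print(is_monotonic(compression, increasing=True))
print(candidates)
```

With this cut-off the methyl, ethyl and propyl analogs remain as candidates, while the isopropyl and hexyl analogs are excluded by their low reactivity despite their high compression scores.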

The hydroxyalkyl series (18-20) has very interesting polymerase behavior. 
7-Hydroxymethyl-7-deaza-2'-deoxyguanosine-5'-triphosphate (18) with reactivity 
and uniformity as good as 2'-dGTP (1) and a better compression score than dZTP (8) 
was by far the best sequencing analog tested. Under difficult template conditions, 
this analog out-performed any other analog tested in this study. Extending the 
alkyl chain by one methylene group, to hydroxyethyl (19), led to poor reactivity 
and polymerase stops. Hydroxypropyl dZTP (20) did not terminate polymerase 
extensions but has large uniformity problems. We believe that the hydrogen bonding
capability in these analogs is responsible for their polymerase behavior and more 
studies are underway to understand this class of molecules. 

Two compounds (23-24) were prepared as analogs of 7-substituted-7-deaza-dITP.
Initial results indicate that the 7-propynyl derivative (23) is as good a
substrate as 2'-dITP, or better, for Thermo Sequenase™ DNA polymerase. The
propynyl substitution also helps band uniformity while retaining an excellent
compression score. Further testing of this analog will be done to evaluate its utility
for routine sequencing with both dye-labeled primers and dye-labeled
terminators in slab gels and capillary electrophoresis. The alkyl addition to
7-deaza-dITP, namely 7-propyl-7-deaza-dITP (24), worsened reactivity of an already poor
substrate.

In conclusion, we have synthesized a number of analogs of 2'-deoxyguanosine
triphosphate. Our results show that 7-ethyl, 7-propyl, and
7-hydroxymethyl-7-deaza-2'-deoxyguanosine triphosphates are good
replacements for 2'-dGTP in sequencing reactions, eliminating compression
artifacts to a greater degree than the parent compound,
7-deaza-2'-deoxyguanosine triphosphate. We are now in the process of
determining why certain analogs relieve compression artifacts better than
others do and examining other biological activities for our novel
nucleosides and nucleotides.

ABSTRACT

Pairs of high density oligonucleotide arrays (DNA
chips) consisting of >96 000 oligonucleotides were
designed to screen the entire 5.53 kb coding region of
the hereditary breast and ovarian cancer BRCA1 gene
for all possible sequence changes in the homozygous
and heterozygous states. Single-stranded RNA targets
were generated by PCR amplification of individual
BRCA1 exons using primers containing T3 and T7
RNA polymerase promoter tails followed by in vitro
transcription and partial fragmentation reactions.
Fluorescent hybridization signals from targets
containing the four natural bases to >5592 different
complementary 25mer oligonucleotide probes on
the chip varied over two orders of magnitude. To
examine the thermodynamic contribution of rU-dA and
rA-dT target-probe base pairs to this variability, modified
uridine [5-methyluridine and 5-(1-propynyl)-uridine]
and modified adenosine (2,6-diaminopurine riboside)
5'-triphosphates were incorporated into BRCA1 targets.
Hybridization specificity was assessed based upon
hybridization signals from >33 200 probes containing
centrally localized single base pair mismatches relative
to target sequence. Targets containing 5-methyluridine
displayed promising localized enhancements in
hybridization signal, especially in [...]
target tracts, while maintaining single nucleotide
mismatch hybridization specificities comparable with
those of unmodified targets.

INTRODUCTION 

Light-directed combinatorial chemical approaches allow the
manufacture of high density arrays consisting of >10[...] distinct
oligonucleotide species (20 µm feature size) on 1.2 x 1.2 cm
glass surfaces (1-2). Such arrays have been used to screen for
mutations and polymorphisms in the CFTR gene (3), the HIV-1
reverse transcriptase and protease genes (4), the β-globin gene
(5), the mitochondrial genome (6) and the BRCA1 gene (7).
Furthermore, they have been used to monitor gene expression (8),
analyze gene function (9), optimize antisense oligonucleotide
design (10) and acquire information from orthologous genes in
related species (11).

A significant challenge in high density oligonucleotide array-based
applications is to develop assay conditions so all fully
complementary perfect match oligonucleotide probes of varying
sequence content produce robust and specific target hybridization
signals. Subsets of perfect match probes could have a greatly
diminished hybridization signal due to decreased duplex stability
resulting from sequence composition effects and inter- and
intramolecular structures in both target and probes. When reliable
data from such probes must be generated, hybridization conditions
which are suboptimal for the specificity of other probes with
robust hybridization signals may have to be employed. Herein, we
analyze the affinity and specificity of RNA targets toward >96 000
oligonucleotide probes present on a pair of high density
oligonucleotide arrays that scan the entire coding region of the
BRCA1 gene for all possible homozygous and heterozygous
sequence changes. To pursue sequence composition effects on target
hybridization and to explore possible solutions, we evaluated the
effect of incorporating modified nucleoside triphosphates into
BRCA1 RNA target on hybridization signal and single nucleotide
mismatch hybridization specificity.

MATERIALS AND METHODS

Synthesis of pyrimidine 5'-triphosphates

The synthesis of 5-(1-propynyl)-uridine was accomplished in three
steps from commercially available 5-iodouridine ([...]). First,
treatment with excess acetic anhydride in pyridine produced
tri-O-acetyl-5-iodouridine. This compound was converted to the
5-propynyl analog using the method of Hobbs (12). The free
nucleoside was then generated by reaction with NaOCH3 in
methanol followed by desalting over BioRad AG-501 mixed bed
[...].

The 5'-triphosphates of 5-(1-propynyl)-uridine and commercially
available 5-methyluridine (R.I. Chemical) were synthesized
using a two step procedure. First, conversions of the nucleosides
into the crude 5'-phosphodichlorodates were accomplished using
the procedure of Sowa and Ouchi (13). Without isolation, these
compounds were directly converted into the triphosphates by the
addition of tributylammonium pyrophosphate and tributylamine
in DMF (14). The triphosphates were purified by anion-exchange
chromatography eluting with a gradient of methylammonium
formate (pH 6.5). Extensive lyophilization and co-evaporation with
water provided the desired triphosphates as their methylammonium
salts. Identities of both compounds were confirmed by 31P NMR
spectroscopy and negative ion FAB-MS and the purities were
determined to be >95% by analytical anion-exchange HPLC.

Data for 5-MeUTP: 31P NMR (202 MHz, D2O) δ -10.40 (d, J =
19.7 Hz, Pγ), -11.22 (dd, J = 20.4, 2.5 Hz, Pα), -22.84 (unresolved
dd, J(apparent) = 19.8 Hz, Pβ); MS (negative ion FAB) m/z 497
(100%, [M4- + 3H+]), 479 (24%, [M4- + 3H+ - H2O]), 331 (82%).

Data for PUTP: 31P NMR (202 MHz, D2O) δ -9.24 (m, Pα),
-10.69 (d, J = 19.3 Hz, Pγ), -22.27 (unresolved dd, J(apparent) = 19.7
Hz, Pβ); MS (negative ion FAB) m/z 521 (100%, [M4- + 3H+]).

NMR spectra were obtained on a Bruker AM500 spectrometer.
MS were obtained on a VG Autospec mass spectrometer.

Synthesis of diaminopurine 5'-triphosphate (rDTP)

Diaminopurine-5'-monophosphate was synthesized from
diaminopurine riboside (Reliable BioPharmaceuticals) using the described
procedure (15) and purified on DEAE-Sephadex employing a
0-0.5 M LiCl gradient to give an 80% yield. The title compound
(rDTP) was then synthesized from the monophosphate (16) in
80% yield, purified by RP-HPLC in 50 mM TEAA using an
acetonitrile gradient and characterized by 31P NMR and MS.

Data for DTP: 31P NMR (D2O, relative to H3PO4) δ -22.4 (t),
-10.8 (d), -8.6 (broad, m). LCMS = 521.0 (M-H).

Table 1. BRCA1 exon amplification primers

Exon  T3 forward primer sequence (5'-3')
      T7 reverse primer sequence (5'-3')

2     d(ATTAACCCTCACTAAAGGGAAGTTGTCATTTTATAAACCTTT)
      d(TAATACGACTCACTATAGGGATGTCTTTTCTTCCCTAGTATGT)
3     d(ATTAACCCTCACTAAAGGGAACGAACTTGAGGCCTTATG)
      d(TAATACGACTCACTATAGGGATTGGATTTTCGTTCTCACTTA)
5     d(ATTAACCCTCACTAAAGGGACTCTTAAGGGCAGTTGTGAG)
      d(TAATACGACTCACTATAGGGATTCCTACTGTGGTTGCTTCC)
6     d(ATTAACCCTCACTAAAGGGACTTATTTTAGTGTCCTTAAAAGG)
      d(TAATACGACTCACTATAGGGATTTCATGGACATCACTTGAGTG)
7     d(ATTAACCCTCACTAAAGGGACAACAAAGAGCATACATAGGG)
      d(TAATACGACTCACTATAGGGAGGGCTAAGGCAGGAGGACTGCT)
8     d(ATTAACCCTCACTAAAGGGATGTTAGCTGACTGATGATGGT)
      d(TAATACGACTCACTATAGGGATCCAGCAATTATTATTAAATAC)
9     d(ATTAACCCTCACTAAAGGGACCACAGTAGATGCTCAGTAAATA)
      d(TAATACGACTCACTATAGGGATAGGAAAATACCAGCTTCATAGA)
10    d(ATTAACCCTCACTAAAGGGATGGTCAGCTTTCTGTAATCG)
      d(TAATACGACTCACTATAGGGAGTATCTACCCACTCTCTTCTTCAG)
11    d(ATTAACCCTCACTAAAGGGAATTAAATGAAAGAGTATGAGC)
      d(TAATACGACTCACTATAGGGAGTGCTCCCAAAAGCATAAA)
12    d(ATTAACCCTCACTAAAGGGAGTCCTGCCAATGAGAAGAAA)
      d(TAATACGACTCACTATAGGGATGTCAGCAAACCTAAGAATGT)
13    d(ATTAACCCTCACTAAAGGGAATGGAAAGCTTCTCAAAGTA)
      d(TAATACGACTCACTATAGGGATGTTGGAGCTAGGTCCTTAC)
14    d(ATTAACCCTCACTAAAGGGACTAACCTGAATTATCACTATCA)
      d(TAATACGACTCACTATAGGGAGTGTATAAATGCCTGTATGCA)
15    d(ATTAACCCTCACTAAAGGGATGGCTGCCCAGGAAGTATG)
      d(TAATACGACTCACTATAGGGAACCAGAATATCTTTATGTAGGA)
16    d(ATTAACCCTCACTAAAGGGAATTCTTAACAGAGACCAGAAC)
      d(TAATACGACTCACTATAGGGAAAACTCTTTCCAGAATGTTGT)
17    d(ATTAACCCTCACTAAAGGGAGTGTAGAACGTGCAGGATTG)
      d(TAATACGACTCACTATAGGGATCGCCTCATGTGG[...]A)
18    d(ATTAACCCTCACTAAAGGGAGGCTCTTTAGCTTCTTAGGAC)
      d(TAATACGACTCACTATAGGGAGACCATTTTCCCAGCATC)
19    d(ATTAACCCTCACTAAAGGGATCTCCGTGAAAAAGAGC)
      d(TAATACGACTCACTATAGGGACATTGTTAAGGAAAGTGGTGC)
20    d(ATTAACCCTCACTAAAGGGATATGACGTGTCTGCTCCAC)
      d(TAATACGACTCACTATAGGGAGGGAATCCAAATTACACAGC)
21    d(ATTAACCCTCACTAAAGGGAATCATCAGGTGGTGAACAGAAG)
      d(TAATACGACTCACTATAGGGAGTGCTGGAACTCTGGGGTTCT)
22    d(ATTAACCCTCACTAAAGGGATCCCATTGAGAGGTCTTGCT)
      d(TAATACGACTCACTATAGGGAGAAGACTTCTGAGGCTAC)
23    d(ATTAACCCTCACTAAAGGGACAGAGCAAGACCCTGTCTC)
      d(TAATACGACTCACTATAGGGACATTTTAGCCATT[...]CA)
24    d(ATTAACCCTCACTAAAGGGATGATTAGAGCCTAGTCCAGG)
      d(TAATACGACTCACTATAGGGATAGCCAGAAGTCCTTTTCAGG)

RNA target preparation 

In vitro transcription templates were generated by PCR
amplification of all BRCA1 coding exons from genomic DNA
using intronic forward and reverse primer pairs containing T3 and
T7 promoter sequences, respectively (Table 1). Exon 11 was
amplified using the EXPAND™ Long Range PCR Kit (Boehringer
Mannheim) (7). The remaining 21 coding exons were amplified
using the Amplitaq Gold PCR Kit (Perkin Elmer). Approximately
5 ng of each exon (except exon 11) were pooled and subject to T3
and T7 RNA polymerase in vitro transcription reactions. Exon 11
templates (~50 ng) were transcribed in a separate reaction. In vitro
transcription reactions were performed in 20 µl reaction volumes
using T3 RNA polymerase transcription buffer (Promega), 0.7 mM
of the appropriate nucleoside triphosphates, 10 mM DTT, 0.7 mM
biotin-16-CTP (Enzo Diagnostics) and 10 U T3 or T7 RNA
polymerase as indicated. A 10 µl volume of pooled BRCA1 exons
was diluted into an 8 µl solution of 100 mM MgCl2 with exon 11
transcription products separately treated in a like manner.
Reactions were incubated at 94°C for 15 and 45 min, respectively,
to fragment targets into ~50-100 nt long pieces which are more
accessible to hybridization (7). In theory this produces a relatively
random distribution of fragmentation products; however, some
phosphodiester internucleotide linkages may be more reactive to
hydrolysis than others. This may influence target hybridization in
some sequence contexts and should be taken into consideration
when interpreting array hybridization data in this study.

Array hybridization and data collection 

Fragmented exon 11 and pooled exon targets were combined and
diluted into a 400 µl volume of hybridization buffer A (3 M
tetramethylammonium chloride, 1x TE, pH 7.4, 0.001% Triton
X-100) or B (6x SSPE, 0.005% Triton X-100) containing either
1 nM 5'-fluorescein-labeled control oligodeoxyribonucleotides S
(5'-CGGTAGCATCTTGAC-3') or AS (5'-GTCAAGATGCTACCG-3')
(for arrays complementary to sense and antisense
strand targets, respectively) (7). Arrays were hybridized with
target, stained with a phycoerythrin-streptavidin conjugate
(Molecular Probes) and hybridization signals quantitated as
previously described (7).

RESULTS AND DISCUSSION 

Oligonucleotide array design 

Figure 1. Proposed hydrogen bonding schemes of modified pyrimidine and
purine base pairs. meU, PU and DAP represent 5-methyluridine,
5-(1-propynyl)-uridine and 2,6-diaminopurine, respectively. Dashed lines
represent proposed hydrogen bonds.

Extending previous analysis of the 3.43 kb central BRCA1 exon
11 (7), a pair of arrays consisting of >96 600 oligonucleotides was
designed to scan both strands of the 5.59 kb BRCA1 coding
sequence (containing 22 coding exons) for all possible sequence
changes not involving insertions and deletions greater than the
probe length. Every BRCA1 nucleotide position is interrogated by
four 25 nt probes on the chip, each substituted with one of the four
nucleotides in the central position. The ratio of the hybridization
signal from perfect match probes to those from the three single
nucleotide substitution mismatch probes provides a measurement
of hybridization specificity under a given set of conditions
(3,4,6,7). A set of perfect match oligonucleotides (25, 23 and 22
nt in length) per target strand base form a contig of single
nucleotide overlapping probes which tile across all BRCA1
coding exons along with 10 bp of flanking intronic sequences.
These probes are used in two color hybridization mutational
analysis experiments (6,7).
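
The four-probe interrogation scheme and the perfect match to mismatch ratio can be sketched as follows. This is a simplified illustration, not the array design software. The reference string is a hypothetical toy sequence, probes are written in the sense of the reference rather than as its complement, and averaging the three mismatch signals is an assumption, since the text does not say how the three ratios are combined.

```python
BASES = "ACGT"

def probe_set(reference, center, length=25):
    """Four probes centred on 0-based index `center`, one per central base."""
    half = length // 2
    if center < half or center + half >= len(reference):
        raise ValueError("not enough flanking sequence for a full-length probe")
    left = reference[center - half:center]
    right = reference[center + 1:center + half + 1]
    return {base: left + base + right for base in BASES}

def specificity(signals, perfect_base):
    """Perfect match signal divided by the mean of the three mismatch signals."""
    mismatches = [s for base, s in signals.items() if base != perfect_base]
    return signals[perfect_base] / (sum(mismatches) / len(mismatches))

reference = "ACGT" * 8                     # hypothetical 32 nt stretch of sequence
probes = probe_set(reference, center=15)   # interrogate position 15 (a T)

# Hypothetical fluorescence readings for the four probes at that position.
signals = {"A": 100.0, "C": 120.0, "G": 80.0, "T": 1000.0}
ratio = specificity(signals, perfect_base=reference[15])

print(probes[reference[15]])
print(ratio)
```

Sliding `center` along the reference yields the full tiling; each step adds four probes, which is how the probe count scales with the length of the interrogated sequence.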

Modified nucleoside triphosphate design 

To directly test the effects of target sequence composition due to 
dA-rU and dT-rA probe-target interactions on the range of 
hybridization signals, we incorporated modified nucleotides into 
BRCA1 targets. Elegant studies have shown that several RNA 
polymerases can tolerate template-directed incorporation of 
non-natural nucleoside triphosphates (17,18). Furthermore, 
modified uridine derivatives have been characterized which 
enhance hybridization with adenosine-rich targets including 
5-methyluridine (meU) (19) and 5-(1-propynyl)-deoxyuridine
(PdU). The former is a naturally occurring post-transcriptional
modification in several tRNA species (20) while the latter has
been employed in antisense oligonucleotide gene expression
inhibition studies (21). The enhanced thermodynamic stability of
these modified uridine-containing base pairs, shown in Figure 1,
has been postulated to be due to more favorable stacking
interactions and entropic factors (21,22). These entropic factors
may stem from the enhanced displacement of highly ordered 
water molecules from the duplex due to these modified uridines 
(21,22). 2,6-Diaminopurine (DAP) is a modified adenine base 
which enhances binding affinity to thymine although having 
significant affinity to other bases in some sequence contexts 
(23,24). This modified adenine has been proposed to increase the 




Figure 2. Hybridization signal intensities of modified pyrimidine antisense
targets. Averaged fluorescence intensities (two experiments) of targets hybridized
(buffer A, 40°C) to arrayed BRCA1 perfect match probes are shown.
Fluorescent signal intensities from perfect match probes complementary to nt
2500-3000 of the BRCA1 cDNA sequence are plotted on the log scale y-axis
with the corresponding nucleotide position listed on the x-axis. Dark blue, dark red
and green lines represent data from unmodified, meU and PU targets, respectively.
Rolling average percentages (9 nt window size) of array A·T and A·G content are
plotted in light blue and pink, respectively, relative to the right y-axis.

stability of base pairs with thymidine (Fig. 1) due to altered 
stacking interactions, the formation of an additional hydrogen 
bond in the modified base pair and removing the spine of 
hydration in the minor groove (23,24; Fig. 1). 

Effects of modified nucleoside triphosphates on in vitro 
transcription reactions 

In vitro transcription reactions were performed in the presence of 
ATP, CTP (including biotin-15-CTP for post-hybridization array 
staining with a phycoerythrin-streptavidin conjugate), GTP and 
UTP. When examining 5-methyl-UTP (meUTP) and 5-(1-propynyl)-UTP
(PUTP) incorporation, UTP was excluded. Likewise,
ATP was excluded when transcribing with diaminopurine
riboside 5'-triphosphate (DTP). For T3 and T7 RNA polymerase
in vitro transcription reactions, meUTP did not significantly affect
transcription yield relative to UTP based on ethidium bromide
staining of the 3.43 kb exon 11 transcription products on agarose
gels (data not shown). Substitution of PUTP for UTP in T3 RNA
polymerase-mediated transcription reactions caused an ~4-fold
decrease in ethidium bromide stained exon 11 transcription
product. In the analogous T7 RNA polymerase-mediated reaction,
an ~10-fold decrease in exon 11 transcription product was found.
DTP caused a more dramatic decrease in exon 11 transcription
product, 10- and 20-fold for T3- and T7-mediated transcription
reactions, respectively. All reactions containing both DTP and
meUTP or PUTP showed greatly diminished transcription product
yields (>50-fold).

Equivolume amounts of transcription products were fragmented 
and diluted into hybridization buffer (Materials and Methods). 
Target concentrations were not adjusted since it is not cost 
practical to increase transcription reaction volumes for PUTP and 
DTP in vitro transcription reactions. Furthermore, co-transcription of
the smaller exons increases sample throughput and decreases 
reagent usage but produces a mixture of RNA species which are 
difficult to resolve and quantitate by gel electrophoresis. In theory 
Table 2. Sequence tracts with enhanced hybridization due to analog
incorporation


Figure 3. Binned hybridization signal intensities of modified antisense targets.
... listed value, except for the first two bins, which correspond to
perfect match probes having between 0 and 500 and those having between 500
and ... fluorescence intensity units). ...,
red, green and black lines represent data acquired from unmodified, meU, PU
and DAP targets respectively.

it is possible to normalize the concentration of unmodified targets
relative to the 5-(1-propynyl)-uridine and diaminopurine
modified targets. However, this would globally reduce hybridization
for unmodified targets, making it difficult to analyze
signals from all oligonucleotide probes. Therefore, the results
generated from 5-(1-propynyl)-uridine and diaminopurine
(and to a lesser extent 5-methyluridine) containing targets should
be taken as qualitative rather than quantitative. Our goal was to
elucidate how modified nucleoside triphosphate incorporation
would affect the performance of BRCA1 target hybridization
within the context of previously established assay conditions (7).

Hybridization properties of unmodified targets 

The intensity of specific target hybridization may be shown by
plotting hybridization signal strength to each perfect match probe
versus nucleotide position (Fig. 2). The fluorescent hybridization
signal of perfect match probes varies over 130- and 230-fold
(sense strand data) and 250- and 620-fold (antisense strand data)
for unmodified target at 40°C in buffers A and B, respectively
(based on averages of the 10 highest and lowest hybridization
signals). Localized decreases in hybridization signal cannot be
fully accounted for by thermodynamic parameters based upon
target A/U or pyrimidine content (Fig. 2) but presumably also
reflect potential intra- and intermolecular target and/or probe
structures that inhibit hybridization (10). Of note is the use of
tetramethylammonium (TMA) salts (25) which, along with
betaine (26), have been widely used to minimize differences in
oligonucleotide hybridization due to A·T content. These effects
have been attributed to a non-cooperative differential stabilization of
A·T (or A·U) base pairs relative to G·C base pairs within duplex
nucleic acid (25). Nevertheless, these buffers do not completely
ameliorate energetic differences in the hybridization of short
oligonucleotide targets (27). Although TMA+ counterions altered
hybridization to subsets of arrayed oligonucleotides relative to


Sub(1)  Strand(2)  Tract(3)    Sequence(4)                      Gain(5)

meU     S          3062-3091   ACCACTTTTTCCCATCAAGTTCATTTTGTT   7.2
meU     S          3488-3509   cagaTTTCTCTCCATATCTGATTTCAgata   6.5
meU     S          3799-3808   ccctgcttccAACACTTGTTatttggtaaa   5.6
meU     S          2984-2988   ggttttgtctatcATCTCagttcagaggca   4.8
meU     S          1652-1665   acatcaggCCTTCATCCTCAGGatcttatc   4.5

PU      S          360-408     tGAAATCATTTGTGCTTTTCAGCTTGACAC   8.8
PU      S          3491-3508   agatttCTCTCCATATCTGATTTCagataa   4.3
PU      S          355-364     agtacgagatTTAGTCAACTtgttgaagag   3.3
PU      S          4828-4857   TCTTCTCTGATGACCCTGAATCTGATCCTT   3.2
PU      S          5531-5543   cacaggtgtCCACCCAATTGTGgttgtgca   2.9

DAP     S          826-849     ataCTGAACATCATCAACCCAGTAATAatg   7.3
DAP     S          3213-3220   taataacattAGAGAAAAtgtttttaaaga   6.9
DAP     S          997-1003    gtttattactcaCTAAAGAcagaatgaatg   6.2
DAP     S          1618-1623   gtcccctcacaaATAAATtaaagcgtaaaa   5.9
DAP     S          1047-1053   ataCTGAACATCATCAACCCAGTAATAatg   5.6

meU     AS         2309-2322   agagaagaAAAAGAAGAGAAACtagaaaca   20.1
meU     AS         4154-4164   cagatgatgaAGAAAGAGGAAcgggcttga   11.3
meU     AS         3448-3460   aaataaaaaAGCAAGAATATGAagaagtag   10.9
meU     AS         1626-1632   acaaataaattaAAGCGTAaaaggagacct   10.9
meU     AS         2071-2076   gtgaagagataaAGAAAAaaaagtacaaca   9.7

PU      AS         2309-2321   agagaagaAAAAGAAGAGAAACtagaaaca   174.0
PU      AS         3160-3165   ctgaaagagaaaTGGGAAatgagaacattc   35.0
PU      AS         3216-3224   ataacattagaGAAAATGTTtttaaagaag   30.5
PU      AS         4155-4161   tcagatgatgaaGAAAGAGgaacgggcttg   28.6
PU      AS         3450-3458   aaataaaaaagCAAGAATATgaagaagcag   27.7

DAP     AS         3634-3655   taaggAAAGTTCTGCTGTTTTTAGCAAaag   5.0
DAP     AS         3893-3912   gaggaGAATTTATTATCATTGAAGAatagc   4.6
DAP     AS         254-263     TAATTTATAGTTTTGCATGCtgaaac 1 1   4.3
DAP     AS         3204-3233   AATAACATTAGAGAAAATGTTTTTAAAGAA   4.1
DAP     AS         1433-1450   tcatgaGGCTTTAATATGTAAAAGtgaaag   4.0

1 Target substitutions with meU, 5-methyluridine and PU, 5-(1-propynyl)-uridine.
2 S, sense and AS, antisense strand data.
3 Exonic nucleotide tracts with peak hybridization enhancements relative to
unmodified targets (40°C, buffer A).
4 Tracts (BRCA1 sense strand nucleotide sequence shown) displaying the five
largest hybridization signal increases relative to unmodified targets. Italicized
letters indicate intronic sequence and uppercase letters represent nucleotides
with highest enhancement levels.
5 Maximum level of hybridization signal enhancement.

Na+ counterions, they did not produce a globally uniform
hybridization pattern (data not shown).
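
The fold-range figures quoted in this section are based on averages of the 10 highest and 10 lowest perfect match signals. A minimal sketch of that calculation follows; the function name is hypothetical and the handling of non-positive signals is an assumption, since the text does not say how background-level values were treated.

```python
def signal_fold_range(intensities, n=10):
    """Ratio of the mean of the n highest signals to the mean of the
    n lowest signals (n=10 follows the description in the text)."""
    ordered = sorted(intensities)
    if len(ordered) < 2 * n:
        raise ValueError("need at least 2*n intensity values")
    low = sum(ordered[:n]) / n
    high = sum(ordered[-n:]) / n
    if low <= 0:
        # Assumption: a non-positive baseline makes the fold range undefined.
        raise ValueError("mean of lowest signals must be positive")
    return high / low
```

Averaging over several probes at each extreme keeps a single outlier from dominating the reported range.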

Hybridization properties of modified targets

The range of fluorescent signal intensities was narrowed somewhat
by meU and PU incorporation (Fig. 2). Global effects on the
hybridization signal strength of meU, PU and DAP modified
targets were further characterized by calculating the number of
nucleotide positions having signal intensities within designated
bins (Fig. 3). Relative to unmodified antisense target, meU- and
PU-substituted targets showed hybridization signals shifted
towards overall higher values while those containing DAP shifted
towards lower values. Relative to unmodified sense strand target,
meU- and DAP-substituted targets showed similar average
hybridization signal intensities (differing <1.3-fold) while
PU-substituted target had an ~2-fold decreased hybridization
signal (buffer A, 40°C). This may result from lowered transcription
reaction yields as well as from intra- and/or intermolecular target
structures.

To examine localized changes in hybridization signal due to
modified nucleotide incorporation, we calculated the ratio of
perfect match probe hybridization signal intensity of modified to
Figure 4. Relative hybridization intensities of modified pyrimidine antisense targets. Average fluorescence intensities of targets (two experiments) hybridized (buffer A, 40°C)
to perfect match probes corresponding to exon 11 of the BRCA1 oligonucleotide arrays were quantitated. Ratio of perfect match probe hybridization signals of (A) meU
target relative to unmodified target and (B) PU target relative to unmodified target are given (x-axis: nucleotide position).



unmodified target for each sense and antisense strand nucleotide
(Fig. 4). The largest signal enhancements in meU- and PU-substituted
targets were primarily found in pyrimidine-enriched tracts
containing a large number of uridine residues (Table 2).
Homopyrimidine tracts are preferentially localized on the antisense
BRCA1 strand with 15 sequence tracts >10 nt long and only
one such homopurine tract on this strand. Since DNA-RNA
hybrid duplexes containing homopyrimidine RNA tracts are less
stable than hybrid duplexes of identical sequence containing
homopurine RNA tracts (28), modified uridine analogs have the
best opportunity to significantly affect hybridization in these
sequence contexts (Fig. 2). This could explain why meU and PU
substitutions have a greater positive impact on hybridization
signal from antisense than from sense target strands. While
DAP-substituted targets also show regions of enhanced signal
(Table 2) they are significantly less pronounced than the highest
found in the pyrimidine-substituted targets. This may reflect the
relative stability of DNA-RNA hybrids containing unmodified
homopurine RNA strands.
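
The per-position ratio of modified to unmodified perfect match signal (Fig. 4) and the peak gains listed in Table 2 can be illustrated as follows. This is a sketch under assumptions rather than the authors' procedure: the function names and the 2.0-fold threshold used to delimit a tract are hypothetical, and positions are reported 1-based and inclusive.

```python
def enhancement_ratios(modified, unmodified):
    """Per-position ratio of modified-target to unmodified-target signal."""
    if len(modified) != len(unmodified):
        raise ValueError("signal lists must have the same length")
    return [m / u for m, u in zip(modified, unmodified)]


def enhanced_tracts(ratios, threshold=2.0):
    """Return (start, end, max_gain) for each contiguous run of positions
    whose ratio is at least `threshold`, sorted by decreasing max_gain.

    The threshold is an illustrative assumption, not a value from the text.
    """
    tracts = []
    start = None
    for i, ratio in enumerate(ratios):
        if ratio >= threshold:
            if start is None:
                start = i
        elif start is not None:
            tracts.append((start + 1, i, max(ratios[start:i])))
            start = None
    if start is not None:
        tracts.append((start + 1, len(ratios), max(ratios[start:])))
    return sorted(tracts, key=lambda t: t[2], reverse=True)
```

Taking the first five entries of the sorted list would give the kind of "five largest increases" summary that Table 2 reports for each target and strand.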

Unmodified and meU-substituted targets show similar single
nucleotide mismatch destabilization properties on both strands
under a variety of assay conditions (Table 3 and Figs 5 and 6).
Significantly, this selectivity is maintained with the antisense
strand where meU incorporation produced the greatest localized
hybridization signal enhancements (Fig. 4 and Table 2). When
comparing hybridization properties of meU-substituted (buffer A,
42°C) relative to unmodified (buffer A, 40°C) targets, there was
only a 2.3-fold average decrease in single nucleotide mismatch
discrimination in the five antisense strand nucleotide tracts (Table 2)
with the highest levels of signal enhancement. Lower temperatures
are used in this comparison for unmodified targets due to
diminished, thus less reliable, hybridization signal in these
sequence tracts at 42°C.

The global single nucleotide mismatch specificity of
probe-target interactions decreases when comparing PU-substituted
(42°C) and unmodified targets under identical hybridization
conditions (Table 3 and Fig. 5). This decreased specificity is
highlighted in areas of the greatest probability of PU incorporation.
When comparing hybridization properties of PU-substituted (buffer
A, 42°C) relative to unmodified (buffer A, 40°C) targets, there
was a 5.1-fold average decrease in single nucleotide mismatch
discrimination in the five antisense strand nucleotide tracts (Table 2)
with the highest levels of signal enhancement.
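
The specificity ratios tabulated in Table 3 can be pictured with the sketch below. It rests on assumptions: the text does not state whether the perfect match signal is compared with the highest or the mean of the three mismatch signals, so the highest mismatch is used here, and the assignment of boundary values to the upper bin is a convention chosen for the example. The bin edges are those listed in Table 3; the function names are hypothetical.

```python
from bisect import bisect_right

# Bin edges and labels as listed in Table 3.
BIN_EDGES = [1.2, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
BIN_LABELS = ["<1.2", "1.2-2.0", "2.0-3.0", "3.0-4.0", "4.0-5.0", "5.0-6.0",
              "6.0-7.0", "7.0-8.0", "8.0-9.0", "9.0-10.0", ">10.0"]


def specificity_ratio(perfect_match, mismatches):
    """Perfect match signal divided by the highest mismatch signal
    (assumption: the strongest of the three mismatch probes is used)."""
    return perfect_match / max(mismatches)


def bin_frequencies(ratios):
    """Percentage of positions falling in each Table 3 ratio bin.
    A ratio equal to a bin edge is placed in the higher bin."""
    counts = [0] * len(BIN_LABELS)
    for ratio in ratios:
        counts[bisect_right(BIN_EDGES, ratio)] += 1
    total = len(ratios)
    return {label: 100.0 * c / total for label, c in zip(BIN_LABELS, counts)}
```

A shift of frequency from the higher bins toward the "<1.2" and "1.2-2.0" bins corresponds to the loss of mismatch discrimination described for PU-substituted targets.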

An example of the localized effects of modified pyrimidine
substitutions on target hybridization signal strength from perfect
match and single nucleotide mismatch probes is shown in
Figure 6. While meU incorporation enhances hybridization signal
and maintains a specific hybridization pattern to match probes in
both sequence contexts, PU incorporation increases hybridization
signal at the expense of single nucleotide mismatch specificity
(Fig. 6c and f). Importantly, significantly increased PU-substituted
target cross-hybridization often occurs in all four probes
interrogating a single target nucleotide. Increased cross-hybridization to
other areas of the array is also found with PU-substituted relative
to unmodified oligodeoxyribonucleotides when hybridized in
buffer A (data not shown).

DAP-substituted sense targets produce significantly stronger 
cross-hybridization to single base pair mismatch probes than 
unmodified targets (Table 3). Nevertheless, when comparing 
hybridization properties of DAP-substituted (buffer A, 42°C) 
relative to unmodified (buffer A, 40°C) targets, there was only
Table 3. Single nucleotide mismatch specificity ratios

Target:        unmodified    5-methyluridine    5-(1-propynyl)-uridine    2,6-diaminopurine
Temp(1) (°C):  35 40 42 45   35 40 42 45        35 40 42 45               35 40 42



Bin' 



T 



% Frequency 



% Frequency 



% Frequenc y 
2.6 



% Frequency 
17.B 



< 1.2 
1.2-2.0 
2.0 - 3.0 
3.0 - 4.0 
4.0 - 5.0 
5.0 - 6.0 
6.0 - 7.0 
7.0 - 8.0 
8.0 • 9.0 
9.0-10.0 

>10.0 



n-o3 

25.5 16.0 
30.7 30.3 

17.6 20.5 
10.5 13.1 

5.7 7.6 

3.4 4.2 
2.1 3.1 
1.0 1.7 
0.7 0.9 

1.5 2.3 



0.4 0.3 
13.8 6.5 
32.0 24.8 
23.3 25.5 
13.5 17.5 
7.3 10.6 

4.0 6.1 

2.1 3.3 
1.3 2.3 
0.6 1.1 
1.7 2.2 



1.5 0.5 
32.0 22.9 
36.4 39.2 
16.3 19.8 
7.0 8.8 
3.2 4.3 
1.5 2.1 
0.9 1.0 
0.5 0.6 
0.2 0.3 
0.5 0.7 



0.4 0.4 

21.3 18.3 
37.6 41.5 
21.0 22.7 

10.4 10.1 
4.5 3.8 
2.2 1.6 
1.2 0.8 
0.6 0.3 
0.4 0.2 
0.5 0.3 



7.2 3.5 
35.2 29.1 

33.2 39.0 

14.3 16.8 
5.5 6.6 
2.4 2.6 
1.1 1.2 
0.5 0.5 
0.3 0.3 
0.1 0.2 
0.2 0.3 



1.5 
21.4 16.5 
41.7 43.9 
22.0 24.1 
7.6 8.7 



2.7 
1.3 
0.5 
0.2 
0.1 
0.1 



2.8 
1.3 
0.7 
0.2 
0.2 
0.1 



20.4 13.9 
43.3 45.0 34.9 
20.9 22.6 25.4 
8.0 8.8 11.1 
3.8 4.6 5.3 
1.6 2.3 2.8 
0.8 1.2 1.3 
0.4 0.6 0.6 
0.2 0.4 0.4 
0.1 0.1 0.2 
0.4 0.3 0.3 



Bin 



% Frequency 



% Frequency 



% Frequency 

1.2 1.1 
26.1 20.9 
37.6 41.4 
18.6 20.9 

8.3 9.1 



% Frequency 
19.21 



< 1.2 
1.2-2.0 
2.0 - 3.0 
3.0 - 4.0 
4.0 - 5.0 
5.0 - 6.0 
6.0 - 7.0 
7.0 - 8.0 
8.0 - 9.0 
9.0-10.0 

> 10.6 



3.6 1.1 
38.8 26.1 
28.0 31.9 
13.5 18.8 



7.1 
3.4 
2.0 
1.3 
0.7 
0.4 
1.1 



10.7 
5.4 
2.8 
1.5 
0.7 
0.4 
0.7 



1.0 
24.7 
32.1 
19.1 
11.0 
5.8 
2.8 
1.5 
0.8 
0.5 
0.6 



0.9 
24.7 
32.2 
19.3 
10.7 
5.6 
2.9 
1.5 
1.0 
0.6 
0.7 



3.2 1.3 
38.6 34.5 
30.6 34.3 
14.3 15.8 

6.3 7.3 
3.3 3.2 
1.6 1.8 
1.0 0.8 
0.6 0.5 
0.3 0.3 
0.3 0,3 



0.6 0.5 
19.3 31.0 
34.6 37.9 
21.6 16.9 
12.1 7.2 



6.4 1.6 
39.3 26.5 
32.3 38.4 
13.0 18.8 



5.6 
2.8 
1.5 
0.8 
0.5 
0.8 



32 
1.5 
0.8 
0.5 
0.2 
0.2 



5.3 
2.1 
0.9 
0.4 
0.2 
0.1 
0.1 



8.0 
3.6 
1.6 
0,7 
0.4 
0.2 
0.2 



4.2 
1.9 
0.9 
0.5 
0.3 
0.3 



3.9 
1.4 
0.8 
0.3 
0.1 
0.1 



46.4 22.3 

39.2 49.1 50.1 

10.3 18.9 19.6 
2.7 6.2 6.5 
0.8 1.9 2.6 
0.5 0.9 1.0 
0.1 0,4 0.6 
0.1 0.2 0.2 
0.0 0.1 0.1 
0.0 0.0 0.1 
0.0 0.0 0.1 



Bin 



% Frequency 



% Frequency 



% Frequency 

3.8 " " 



% Frequency 
2.9 0.9 0.5 
24.0 14.8 15. 

30.6 26.3 33.2 

18.7 20.7 22.2 



<T2 
1.2-2.0 
2.0 - 3.0 
3.0-4.0 
4.0 - 5.0 
5.0 - 6.0 
6.0 - 7.0 
7.0 - 8.0 
8.0 - 9.0 
9.0-10.0 
> 10.0 



0.8 0.2 
16.6 7.4 
28.8 20.8 
19.2 21.1 
12.1 15.6 

7.4 10.4 

4.2 7.3 

3.3 5.2 
2.0 3.1 

1.4 2.4 
4.2 6.7 



0.3 0.5 
4.3 4.0 
14.9 17.2 
20.9 26.6 
17.2 20.8 
12.8 12.5 
8.7 7.2 

6.0 4.2 

4.1 2.6 
2.9 1. 
7.9 2.8 



1.1 0.1 
17.6 11.5 

33.1 31.2 

23.2 24.2 
11.8 13.4 

6.0 7.5 



3.0 
1.4 
1.0 
0.6 
1.1 



4.7 
2.7 
1.6 
1.0 
2.0 



0.2 0.3 
7.5 7.1 

28.0 29.7 

24.1 28.5 
16.4 17.0 

8.8 8.7 
5.7 4.4 
3.3 2.2 
2.2 1.: 
1.2 0.4 
2.5 0.5 



1T3 

4o!5 32.0 
34.4 37.1 
10.8 16.0 



3.1 1.1 
27.5 16.8 
39.0 39. 
18.4 24.7 



3.2 
1.0 
0.5 
0.2 
0.1 
0.0 
0.1 



6.1 
2.6 
1.1 
0.6 
0.2 
0.2 
0.3 



7.0 
2.7 
1.2 
0.5 
0.2 
0.1 
0.2 



11.0 
4.3 
1. 
0.8 
0.3 
0.1 
0.2 



10.0 
5.1 
3.0 
1.8 
1.1 
0.7 
2.1 



13.5 11. 
8.3 6.7 



5.4 
3.5 
2.2 
1.3 
3.2 



4.0 
2.2 

1.; 

0.9 
2.2 



% Frequency 



% Frequency 



% Frequency 
72 3.9 2.8 
32.5 26.7 24.2 
28.0 29.8 30.4 
14.0 16.1 17.2 
7.6 8.4 9.6 

3.6 4.9 5.2 
2.3 3.0 3.4 
1.5 22 2.2 
1.0 1.3 1 
0.7 0.9 1 

1.7 2.8 2.3 



Dill 

< 1.2 


1.7 


1.6 1.8 


2.6 


1.2-2.0 


14.1 


13.3 11.2 


15.2 


2.0 - 3.0 


26.2 25.6 23,6 29.8 


3.0 - 4.0 


19.4 


19.6 20.2 22.5 


4.0 - 5.0 


12.5 


12.9 13.9 


13.2 


5.0 - 6,0 


8.1 


8.6 9.2 


7.4 


6.0 - 7.0 


5.4 


5.6 6.0 


4.2 


7.0 - 8.0 


3.5 


3.6 4.1 


2.3 


8.0-9.0 


2.6 


2.5 3.0 


1.3 


9.0- 10.C 


) 1.8 


1.9 1.9 


0.6 


>10.0 


4.8 


4.8 5.0 


0.8 



0.9 0.3 

16.3 13.0 

31.4 31.8 
21.7 22.5 
11.6 13.3 



6.9 
4.0 
2.5 
1.7 
1.0 
1.9 



7.6 
4.5 
2.6 
1.7 
1.0 
1.7 



0.2 0.4 
12.7 12.B 
33.0 31. 
22.6 23.6 
13.5 14.4 
7.2 8.4 



4.5 
2.4 
1.6 
0.7 
1.5 



3.8 
2.1 
1 

0.8 



3.2 
35.1 
37.7 
14.1 
5.7 
2.1 
0.9 
0.5 
0.2 
0.1 
0.3 



0.7 
17.4 
36.0 
21.9 
10.9 
5.3 
3.0 
1.9 
1.1 
0.5 
1.1 



0.5 0.5 
17.5 21. 
36.7 41. 
20.9 19.4 



11.1 

5.5 
3.3 
1.9 
1.1 
0.6 
0.9 



9.1 
4.5 
1 
0.9 
0.5 
0.2 
0.2 



1 Hybridization temperature.

2 Single nucleotide mismatch specificity ratio bins. 

3 % of BRCA1 coding nucleotide positions within the specified ratio bin. 



a 2.3-fold average decrease in single nucleotide mismatch 
discrimination in the five sense strand nucleotide tracts (Table 2) 
with the highest levels of signal enhancement. Therefore, 
cross-hybridization occurs in different sequence contexts distributed 
throughout the array. Furthermore, DAP-substituted antisense 
targets showed single nucleotide mismatch specificities similar to 
unmodified targets, presumably due to the decreased number of 
adenine-rich tracts on this strand and thus lower levels of DAP 
incorporation. For both sense and antisense analysis, lower 
overall transcriptional yields of DAP-modified targets result in a 
lower target concentration in the hybridization reaction. This 
increases the stringency of the hybridization reaction and 
consequently increases single nucleotide mismatch discrimination. 



Increasing the DAP-modified target concentration to produce 
more robust hybridization signals, especially for antisense strand 
targets, will result in lower single nucleotide discrimination, thus 
reducing the usefulness of this modification. 

Co-incorporation of DAP and modified uridine residues into sense 
and antisense targets had a significantly negative effect on 
hybridization specificity (data not shown). A large component of this 
problem is the decreased transcription product yield (>50-fold). It 
does not appear feasible to simultaneously incorporate modified 
pyrimidine and purine nucleotide triphosphates and retain robust 
product yield which has non-degenerate hybridization specificity 
with these enzymes under these conditions. In the future, RNA 
polymerases may be engineered to have increased ability to 
Figure 5. Single nucleotide ... mismatch probe taken from a ... (axis label: fraction of total nucleotide positions).


incorporate modified nucleoside triphosphates into transcription 
products. 

Potential applications for modified targets 

The examined modified triphosphates can be incorporated into 
large RNA transcripts by T3 and T7 RNA polymerases. While 
DAP and PU incorporation both lead to enhanced hybridization 
in specific sequence contexts, the loss of binding specificity 
reduces the likelihood of their use in mutation screening analysis. 

meU incorporation enhances target hybridization signals within
specific sequence contexts and does not substantially increase
hybridization signal to single nucleotide mismatch probes.
Enhanced RNA target hybridization signals will be especially
important in the mutational analysis of genes having localized
regions of strongly biased A·T sequence content (i.e. BRCA2 and
ATM genes). Furthermore, in conjunction with modified
oligonucleotide surface probes (Fidanza et al., unpublished observations)
it may be possible to normalize the binding affinity of all perfect 
match probes in an array (29). This would potentially allow 
hybridization conditions to be used which globally optimize 
hybridization signal strength and specificity. 

Modified nucleoside triphosphate usage could also benefit
RNA expression monitoring experiments based on hybridization
to high density oligonucleotide arrays (8). In such experiments,
perfect match probe oligonucleotides are selected based upon
sequence composition effects to produce robust and specific
hybridization signals from RNA targets (8). Targets containing
modified bases may have increased affinity towards a number of
perfect match probes previously giving a poor hybridization
signal. This would expand the variety of oligonucleotide probes
which could be used in these experiments and allow increased
freedom in selecting probes placed in strategic positions
(i.e. splice junction sequences so as to monitor the expression of
differentially spliced RNA transcripts).

Modified RNA transcripts can also be used to analyze the
biophysical properties of nucleic acid structures (17,18,30).
Others have incorporated modified bases into RNA during in
vitro selection assays to expand the repertoire of RNA species
having affinities for small molecules or specific catalytic
properties (31). Increasing the stability of A-U base pairs using
this strategy could expand the variety of nucleic acid structures
which may have distinct biophysical properties.

Figure 6. Modified pyrimidine target image comparisons. Magnified digitized
false color images showing hybridization pattern of BRCA1 antisense targets
(buffer A, 40°C). Brightness and contrast settings are changed in each panel to
increase image clarity. Nucleotide identities, determined through dideoxy-
sequencing analysis, are given under the respective column and colored red or
blue if correctly or incorrectly identified by hybridization analysis (perfect
match probe intensity being at least ...
mismatch probe intensity), respectively. Several base calls may be difficult to
visualize due to limitations in printing technology as well as in the linear range
of the human eye for detecting monochromatic color changes. (a-c) Hybridization
patterns of unmodified, meU and PU antisense strand targets, respectively, to
nt 2303-2323 of BRCA1 cDNA (AGAAGAAAAAGAAGAGAAACT). (d-f) Hybridization
patterns of unmodified, meU and PU antisense strand targets, respectively, to
nt 2065-2085 of BRCA1 cDNA (AGATAAAGAAAAAAAAGTACA).

ACKNOWLEDGEMENTS
Background: The power of DNA microarrays derives from their ability to monitor the
expression levels of many genes in parallel. One of the limitations of such powerful analytical tools
is the inability to detect certain transcripts in the target sample because of artifacts caused by
background noise or poor hybridization kinetics. The use of base-modified analogs of nucleoside
triphosphates has been shown to increase complementary duplex stability in other applications, and
here we attempted to enhance microarray hybridization signal across a wide range of sequences
and expression levels by incorporating these nucleotides into labeled cRNA targets.
Results: RNA samples containing 2-aminoadenosine showed increases in signal intensity for a
majority of the sequences. These results were similar, and additive, to those seen with an increase
in the hybridization time. In contrast, 5-methyluridine and 5-methylcytidine decreased signal
intensities. Hybridization specificity, as assessed by mismatch controls, was dependent on both
target sequence and extent of substitution with the modified nucleotide. Concurrent incorporation
of modified and unmodified ATP in a 1:1 ratio resulted in significantly greater numbers of above-
threshold ratio calls across tissues, while preserving ratio integrity and reproducibility.
Conclusions: Incorporation of 2-aminoadenosine triphosphate into cRNA targets is a promising
method for increasing signal detection in microarrays. Furthermore, this approach can be
optimized to minimize impact on yield of amplified material and to increase the number of
expression changes that can be detected.



Background

DNA microarrays have been widely adopted in genomics
because of their ability to simultaneously examine the
expression levels of thousands of genes. As a result, the
scope of applications for microarrays has broadened rapidly,
from drug discovery [1], to classification of cancers
[2-4] and analysis of splice variants [5]. Novel analytical
tools have been constructed to address every component
of the microarray experiment and optimize performance
[6]. However, ideal systems with maximized sensitivity
and data reproducibility have not been achieved. One
approach to enhance sensitivity in microarrays, using a
novel signal amplification technique, has recently been
reported [7]. Another approach is to increase the affinity
of a probe (nucleic acid present on the array) for its target
through modifications to the length [8], chemistry [9], or
physical structure [10] of the probe.

Naturally occurring analogs of purine and pyrimidine bas- 
es have been examined extensively for their ability to in- 
crease the thermodynamic stability of DNA:DNA and 
DNA:RNA duplexes [11-15]. Among these are 2-ami- 
noadenine also known as diaminopurine (DAP), which is 
found in S-2L cyanophage DNA [16], 5-methyl uracil 
(MeU), and 5-methyl cytidine (MeC). Previous studies 
have shown the ribonucleoside triphosphate derivatives
of modified bases to be effectively incorporated by RNA
polymerases [17,18], which makes them excellent
substrates for use in microarray sample preparations requiring
amplification through in vitro transcriptions (IVTs).

Due to their effects on duplex stability and secondary
structures and their lack of replication by DNA polymerases,
modified nucleotides have recently been exploited in
a variety of technologies in molecular biology and genomics.
For example, 2'-O-methyl ribonucleotides and
5-(1-propynyl)pyrimidines have been chemically incorporated
into oligonucleotides which were used to detect telomeric
repeat sequences in fluorescence in situ hybridization
(FISH) assays [19]. Furthermore, chimeric primers containing
deoxynucleotides and 2'-O-methyl ribonucleotides
have been used to eliminate artifacts, produced by
exponential amplification of minor side-products, in cycle
sequencing [20].

Recently, studies employing DNA arrays have also examined
the use of modified nucleotides such as DAP,
5-bromodeoxyuridine, and 2'-O-methylthymidine in the probe
[21,22]. An alternative method to increase the probe-tar-
get affinity is to incorporate the modified ribonucleotide, 
as the triphosphate derivative, during the IVT process so 
that the cRNA produced has the desired level of substitu- 
tion of the corresponding unmodified nucleotide. Such 
an approach has been demonstrated in a study which ap- 
plied these modified cRNA products to high density oligo- 
nucleotide arrays [23]. However, that study measured the 
signal when amplifying a specific gene rather than a ge- 
nome-wide approach. A genome-wide amplification, cou- 
pled with the incorporation of modified nucleotides, 
permits measurement of every transcript in the sample. To 
our knowledge, no systematic studies have been per- 
formed examining the incorporation of modified ribonu- 
cleotide triphosphates into cRNA, yield of amplified 
material, effects on hybridization intensities and repro- 
ducibility, and the impact on the differential expression 
ratios. It will not be possible to gauge the full potential, 
advantages, and disadvantages until such studies have 
been completed. 



In this study, we investigated the possibility that incorpo- 
ration of modified nucleotides into complementary RNA 
(cRNA) target samples could increase signal intensity on 
the Motorola Codelink™ Expression microarray platform. 
The ratios of modified to unmodified nucleoside triphos- 
phates (NTPs) were varied in each target synthesis in order 
to measure the range of effects on cRNA yield, specific
activity of target sample, hybridization signals, and
differential expression ratios. Our results suggest that
incorporation of 2-aminoadenosine (DAP) triphosphate
into target cRNA samples may increase hybridization
signal intensity for a wide variety of RNA:DNA hybrid
duplexes. In contrast, 5-methyluridine and 5-methylcytidine
had detrimental effects on signal intensities. Hybridiza- 
tion specificity was dependent on both target sequence 
and extent of substitution with the modified nucleotide. 
Concurrent incorporation of modified and unmodified 
ATP in a 1:1 ratio resulted in significantly greater numbers 
of above-threshold ratio calls across tissues, while preserv- 
ing ratio integrity and reproducibility. 

Results 

Effect of modified nucleotides on cRNA yield and assessment
of their incorporation and biotin-11-UTP incorporation

Although DAP, MeC, and MeU were tolerated by T7 RNA 
polymerase, incorporation of these analogs into our IVT 
reaction cocktails reduced the yields of amplified cRNA. 
The magnitude of the decrease was 20-30% and depend- 
ed on RNA tissue source and the NTP modification. How- 
ever, within the context of our pre-established assay 
conditions (a single color, single sample per array system 
using ten micrograms of cRNA), enough cRNA was gener- 
ated from each IVT reaction to perform hybridizations in 
triplicate. Thus, we were able to normalize input amounts 
of cRNA for each condition tested in order to quantitative- 
ly compare relative changes due to each respective modi- 
fication. In these experiments, we generated cRNA from 
five micrograms of total RNA and did not explore the im- 
pact on yield when smaller or larger amounts of input to- 
tal RNA are used. 

We next determined how well the modified NTPs were in- 
corporated by the 17 RNA polymerase and whether incor- 
poration of biotinylated UTP was altered due to the 
presence of these modified NTPs. To address these ques- 
tions, we used an analytical method developed in our lab- 
oratory and described in a previous study [24]. Briefly, the 
complex cRNA is digested with P1 nuclease and calf
intestinal phosphatase and applied to a high performance
liquid chromatography (HPLC) column to separate the
nucleosides, followed by an absorbance measurement at 
260 nm. As seen in Figure 1A, when only the unmodified
NTPs are incorporated during the IVT, there is good sepa- 
ration of the individual nucleosides. Furthermore, as re- 
Figure 1

HPLC analysis of digested cRNA demonstrates incorporation of modified nucleotides and their resolution from unmodified
counterparts. Absorbance profiles at 260 nm are shown for (A) control, (B) 1:1 DAP:A, (C) fully substituted DAP, and (D) fully
substituted MeC conditions. Proportions of each nucleoside were calculated using peak areas and extinction coefficients. The
peaks for cytidine, uridine, guanosine, and adenosine show up in the unmodified control sample (A) at approximately 5.0, 7.2,
12.6, and 15.6 min, respectively.



ported earlier [24], using the extinction coefficients and 
integrating the area under these peaks, there are approxi- 
mately equal amounts of each of the four nucleosides. 
When DAP was added at a 1:1 molar ratio to the adenos- 
ine, the chromatogram showed incorporation of DAP into 
the cRNA, and this level of incorporation appeared to be 
equivalent to the level of incorporation of adenosine
(Figure 1B). Moreover, when the adenosine was fully
substituted by the DAP in the IVT reaction cocktail, only a peak
corresponding to the DAP was detected (Figure 1C), and 
this peak was approximately equal in area to that of the 
adenosine in the control situation. Similarly, when MeC 
was fully substituted for cytidine in the IVT reaction cock- 
tail, only a peak corresponding to the MeC was detected 
(Figure 1D), and this peak was approximately equal in



area to that of the cytidine in the control situation. In
contrast, we were unable to examine incorporation levels of
the MeU because the MeU peak was eluted at approximately
the same time as the guanosine peak (data not
shown). It is important to note that these analyses used 
the complex cRNA and represent a global, average view of 
the incorporation. Incorporation may differ somewhat 
depending on the sequence, structure, or expression level 
of the nascent RNA transcript. We therefore incorporated 
each of these modified nucleotides into a unique bacterial 
transcript, generated by run-off transcription from a
plasmid into which the gene was cloned. When this transcript
was digested and applied to the HPLC, we found similar 
results as seen with the complex cRNA (data not shown). 
We conclude that these modified nucleotides are incorpo- 
Figure 2

Modified purines and pyrimidines show differential effects on hybridization intensities when incorporated [...].
Modified and unmodified (control) samples were normalized for concentration, fragmented, and hybridized onto Human Uniset [...],
except for 5-methyl CTP samples, which were hybridized in [...]. [...] probes spotted in 6-fold redundancy.
Average signal intensities after background subtraction were [...] Human Uniset I array for the control samples versus
(A) control, (B) fully DAP-substituted, (C) 1:1 substituted DAP, (D) fully MeC-substituted, and (E) fully MeU-substituted.



rated during the IVT reaction at ratios which are reflective 
of the input ratio to the unmodified counterpart. 
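
The conversion from peak areas to nucleoside proportions described above can be sketched as follows. This is only an illustration, assuming that the molar amount of each nucleoside is proportional to its 260 nm peak area divided by its extinction coefficient; the function name, the peak areas, and the coefficient values are hypothetical placeholders, not the values used in the study.

```python
def nucleoside_fractions(peak_areas, extinction_coefficients):
    """Convert HPLC peak areas into molar fractions of each nucleoside."""
    # Relative molar amount = peak area / extinction coefficient
    relative_moles = {
        name: area / extinction_coefficients[name]
        for name, area in peak_areas.items()
    }
    total = sum(relative_moles.values())
    # Normalize so that the fractions sum to 1
    return {name: moles / total for name, moles in relative_moles.items()}


# Hypothetical peak areas (arbitrary units) and placeholder coefficients.
# Areas are chosen equal to the coefficients, so every fraction is 0.25.
areas = {"C": 7.4, "U": 10.0, "G": 11.5, "A": 15.4}
coefficients = {"C": 7.4, "U": 10.0, "G": 11.5, "A": 15.4}
fractions = nucleoside_fractions(areas, coefficients)
```

With equal molar amounts of the four nucleosides, as reported for the unmodified control, each fraction comes out at one quarter.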

Because the sensitivity and reproducibility of microarrays 
depend on the specific activity of target cRNA, we also
wanted to determine if adding modified NTPs to the IVT 
changed the incorporation rate of biotin-11-UTP. This
biotinylated nucleotide is used in our biotin-streptavidin-Alexa647
conjugate detection system. We found that the
level of biotinylated uridine, detected at 294 nm after di- 
gestion and application to the HPLC, was unaffected by 
the presence of any of the modified NTPs in the IVT reac- 
tion cocktail (data not shown). Moreover, analysis of in- 
dividual transcripts, which were enzymatically digested 
into mononucleosides, showed equivalent incorporation



rates of biotin-11-UTP for both control samples and target
samples containing modified ATP.

Lastly, we examined what effect, if any, incorporation of 
modified NTPs had on the length of the amplified cRNA. 
Transcript size was determined by running samples on an 
Agilent 2100 BioAnalyzer. No difference was observed for 
either individual transcripts or complex samples in modi- 
fied versus unmodified ATP samples (data not shown). 

Effect of modified nucleotides on hybridization signals and 
specificity 

We next determined the effect of these modified NTPs on 
hybridization intensities. After hybridization, scatter plots 
of hybridization intensities were generated comparing 
those intensities generated from unmodified cRNA (con- 
Table 1: Percent signal change due to nucleotide substitution.



trol) to either the duplicate control hybridization or to in- 
tensities generated from cRNA where one of the modified 
NTPs was incorporated. The comparison of control versus
control (Figure 2A) illustrates that, if hybridization inten- 
sities are equal in the two conditions, the points should lie 
on the diagonal. Complete substitution of unmodified 
ATP with DAP (Figure 2B) produced greater increases 
overall in individual probe signals than the target sample 
containing a 1:1 ratio of DAP to unmodified ATP (Figure 
2C), as would be expected if DAP incorporation increased 
the DNA:RNA hybrid duplex stability. A quantitative esti- 



mation of assay performance increase can be derived from 
the median probe signal on each microarray. Although 
Figure 2B and 2C show that increases in signal intensity 
are not linear for all probe sequences, median slide inten- 
sity across sets of duplicate hybridizations for two differ- 
ent tissues (human embryonic kidney and Burkitt's 
lymphoma) increased an average of 40% +/- 7% and 99% 
+/- 6% for the 1:1 DAP:A and all DAP conditions,
respectively, over the unmodified control sample. However,
Figure 2B also shows that full substitution of adenosine by
DAP reduced the hybridization signals for many of the 
probes, generating a bowing towards the unmodified con- 
trol condition and suggesting a duplex destabilizing effect 
of complete DAP substitution in certain sequence contexts 
[11,23]. On a global scale, such a high incidence of probes
with reduced hybridization signals may limit the useful- 
ness of the complete DAP substitution. Of particular inter- 
est is the observation that the number of probes which 
have increased signal and the degree to which their signals 
are increased are highest at the lower end of the signal 
range. In contrast, both of the modified pyrimidine
triphosphates (MeC and MeU) tested resulted in a decrease
in overall signal intensities for nearly all of the probes test- 
ed when compared against unmodified control samples 
(Figures 2D and 2E). However, data points on the right of
the diagonal in Figures 2D and 2E also show that, for a
small fraction of target-probe pairs, substitution with MeC and
MeU increased hybridization signal relative to the
unmodified control. Those probes whose signal intensities
were increased at least two-fold are summarized in Table 
1. As Table 1 shows, for a number of transcripts,
substitution with MeC or MeU may augment hybridization
signals even more than DAP does. This may prove
useful for target samples containing very low concentra- 
tions of these particular transcripts. Nevertheless, because 
our goal was to utilize modifications that would increase 
hybridization signals for the vast majority of target-probe 
duplexes, we subsequently focused our efforts on the sam- 
ples containing DAP. 
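
The two summary measures used in this section, the percent change in median slide intensity and the selection of probes whose signal increased at least two-fold, can be sketched as below. The relative percent change formula is the one given with Table 1; the function names and the probe signals are hypothetical and serve only to illustrate the arithmetic.

```python
from statistics import median


def relative_percent_change(modified, control):
    """100 * (modified sample signal - control sample signal) / control sample signal."""
    return 100.0 * (modified - control) / control


def median_percent_increase(modified_signals, control_signals):
    """Percent change of the median probe signal on the modified slide."""
    return relative_percent_change(median(modified_signals), median(control_signals))


def probes_increased_at_least(fold, modified, control):
    """Return probe names whose modified signal is >= fold times the control signal."""
    return [probe for probe in control if modified[probe] >= fold * control[probe]]


# Hypothetical background-subtracted signals for three probes
control = {"p1": 100.0, "p2": 200.0, "p3": 400.0}
modified = {"p1": 250.0, "p2": 280.0, "p3": 380.0}

slide_increase = median_percent_increase(modified.values(), control.values())
two_fold_probes = probes_increased_at_least(2.0, modified, control)
```

In this toy example the median rises from 200 to 280 (a 40% increase), even though one probe loses signal, which mirrors the observation that the increases are not uniform across probe sequences.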

We next determined whether the increases in signal inten- 
sity observed with DAP substitution were similar in mag- 
nitude to those observed when the hybridization time is 
increased. We therefore hybridized two micrograms of 
cRNA with no modifications or two micrograms with DAP 
at a 1:1 molar ratio to adenosine for 18 hours and hybrid- 
ized two micrograms of cRNA with no modifications for 
42 hours. We generated two scatterplots and overlaid
these scatterplots on each other (Figure 3A). The first
scatterplot compares the intensities of the 18 hour control
with those of the 42 hour control (orange signals). The 
second scatterplot compares the intensities of the 18 hour 
control with those of the DAP-modified cRNA (blue sig- 
nals). The longer hybridization time increases the median 
signal intensity by 47% +/- 2%. The increase is more pro- 
MeC and MeU substitution increases signal intensities for a small
number of probes. Relative percent change was calculated by the
equation: 100 * (modified sample signal - control sample signal) /
control sample signal. Relative change for these probes due to 1:1 DAP:A
substitution is also given as reference.
Figure 3

Effects of DAP substitution are similar, and additive, to those seen with an increased hybridization time. (A) Comparison of
intensities obtained with unmodified cRNA hybridized for 18 hours versus partially DAP-modified cRNA for 18 hours (blue
data points) transposed on a comparison of intensities obtained with unmodified cRNA hybridized for 18 hours versus
unmodified cRNA hybridized for 42 hours (orange data points). (B) Comparison of intensities obtained with unmodified cRNA
hybridized for 18 hours versus partially DAP-modified cRNA hybridized for 42 hours. (C) Plot showing the relative increases
with respect to the 18 hour, unmodified control for increased hybridization time (18 hour => 42 hour), 1:1 DAP:A substitution,
and a combination of increased hybridization time/DAP substitution. The total number of probes was divided into three equally
sized bins according to their signal intensities, with the bins representing low, medium and high expressers. Relative percent
increases for each probe were derived from the equation: 100 * (modified sample signal [time, DAP, or combination of both] -
control sample signal) / control sample signal, and are shown with standard deviation error bars.



nounced for the medium and high signals. As shown be- 
fore, the DAP-modified cRNA generated hybridization 
signals which were also increased relative to the 18 hour 
control hybridization. However, the increase seemed 
more pronounced for the low and medium signals. We 
conclude that both DAP modification and an increased 
hybridization time can affect the hybridization reaction 
but in very different ways. 

Because of the differential effects of increased
hybridization time and DAP substitution on low and high
expressers, we wanted to determine whether the increases seen
with DAP modification and an extended hybridization time
could be additive. We therefore hybridized unmodified
(control) cRNA for 18 hours and a DAP-modified cRNA for
42 hours (Figure 3B). The median slide intensity increased
by 110% +/- 17%, which is approximately double the increase
seen with either the 1:1 DAP:A or the longer hybridization
time by themselves.
When all of the approximately 9,000 probes are divided 
into three equally sized bins representing low, medium, 
and high expressers according to their respective signal in- 
Figure 4

[...] discrimination of DAP- or MeC-substituted and [...] multiple adjacent, centrally located [...]. Modified (triangles), [...]
accession numbers: (A) (X79067), (B) (AF067139), (C) (Z83844), and (D) (NM_004323).



tensities, the effects of increased hybridization time and 
DAP become very evident (Figure 3C). Figure 3C shows 
that the relative increases in signal intensities due to in- 
creased hybridization time are biased towards the medi- 
um and high expressers, with signals increasing an average 
of 48% and 84%, respectively, whereas increases due to 
DAP substitution are more pronounced for low and medium
expressers (57% and 81% increase, respectively). Together,
the two modifications of increased hybridization time and
DAP substitution can be used in concert to amplify signal
intensities for the entire range of probe signals, and as
Figure 3C indicates, these boosts in signals are
approximately additive for the 42 hour, DAP-containing
samples.
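
The binned analysis behind Figure 3C can be sketched as follows. This is a minimal illustration, assuming that probes are ranked by their signal in the 18 hour unmodified control before being split into three equally sized bins (the text does not state which sample was used for ranking); the function names and signal values are hypothetical.

```python
from statistics import mean


def bin_by_control_signal(control, n_bins=3):
    """Rank probes by control signal and split them into n_bins equal bins."""
    ranked = sorted(control, key=control.get)
    size = len(ranked) // n_bins
    bins = [ranked[i * size:(i + 1) * size] for i in range(n_bins - 1)]
    # The last bin takes any remainder when the split is not exact
    bins.append(ranked[(n_bins - 1) * size:])
    return bins


def mean_increase_per_bin(control, modified, n_bins=3):
    """Mean relative percent increase for the low, medium and high bins."""
    return [
        mean(100.0 * (modified[p] - control[p]) / control[p] for p in probes)
        for probes in bin_by_control_signal(control, n_bins)
    ]


# Hypothetical signals: low expressers gain the most in relative terms
control = {"a": 10.0, "b": 20.0, "c": 100.0, "d": 200.0, "e": 1000.0, "f": 2000.0}
modified = {"a": 20.0, "b": 30.0, "c": 150.0, "d": 300.0, "e": 1100.0, "f": 2200.0}

increases = mean_increase_per_bin(control, modified)
```

Here the low, medium and high bins show mean increases of 75%, 50% and 10%, the kind of intensity-dependent pattern the binning is meant to expose.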



We next determined the effect of modified NTPs on hy- 
bridization specificity (the ability to distinguish sequenc- 
es up to a certain homology). We addressed this issue by 
designing probes which had one to four adjacent, central- 
ly located mismatches and comparing hybridization sig- 
nals generated from these mismatched probes to signals 
generated from their corresponding perfect matches. In 
the control situation (hybridization of unmodified cR- 
NA), the hybridization intensity decreased as the number 
of mismatches increased, with two mismatches generally 
destabilizing the duplex sufficiently to reduce the
hybridization signal to 0% of the parent signal. There was one
exception (Figure 4C) where, even in the presence of four
mismatches, the signal was not reduced below 60% of the 
parent signal. This signal was low in the perfect match
and, therefore, this situation most likely reflects a lower- 
ing to the noise levels. In fact, the signal was lowered to 
below threshold levels in the presence of mismatches. 
When equimolar ratios of DAP and unmodified ATP were 
incorporated into the cRNA target (1:1 DAP:A), specificity
was not significantly affected in two of the four probe se- 
quences (Figure 4A and 4B). However, in one probe se- 
quence (Figure 4C), specificity was enhanced. For this test 
probe, the unmodified sample produced unusually high 
signals for base mismatches. Nevertheless, other groups 
have seen similar improvements in mismatch discrimination
with diaminopurine-containing oligomers for different
applications [13]. A fourth probe sequence
demonstrated a smaller destabilizing effect of a single 
mismatch with the partially modified cRNA compared to 
the control (Figure 4D). In this sequence, the hybridiza- 
tion was reduced to the same extent with three mismatch- 
es using either the control or partially modified cRNA. 
Full substitution of adenosine by DAP showed varied ef- 
fects on specificity in the four different probe sequences. 
In two of the probe sequences, the fully modified cRNA 
behaved similarly to the control cRNA (Figures 4A and 
4B). In a third probe sequence, the specificity improved 



for the fully modified cRNA, as it did for the partially 
modified cRNA (Figure 4C). The improvement in specifi- 
city for this probe depended on the extent of DAP modifi- 
cation. In a fourth probe sequence, the fully modified 
cRNA showed a dramatic decrease in specificity with one- 
or two-base mismatches, although three mismatches re- 
duced the hybridization intensity to ~10% of that of the 
perfect match. The loss in specificity for this probe se- 
quence also depended on the extent of DAP modification, 
with one, two, or three mismatches required to reduce the 
hybridization intensity to ~10% for the control, partially
modified, and fully modified cRNA, respectively. A fifth
test probe showed hybridization signal intensity below 
the negative control threshold in both the control and 
modified ATP samples, suggesting either an absence of the 
transcript from the target sample or abundance too low to 
quantify (data not shown). Repeat experiments with 
cRNA containing modified ATP supported the general 
specificity trend: unmodified ATP > 1:1 DAP:A > all DAP
(data not shown). Attempts to increase specificity in the 
all DAP condition by raising the temperature during hy- 
bridization from 37°C to 42°C or 47°C were unsuccessful 
(data not shown). Others have been able to distinguish
sequences of up to 90% homology [6] using the same
platform (3-base mismatch/30-mer oligonucleotide signal
near or below negative control threshold), and we support
their findings here with the control and 1:1 DAP:A condi- 
tion. The fully MeC-substituted cRNA targets were exam- 
ined with respect to specificity in two test sequences 
(Figures 4B and 4D). In one test sequence (Figure 4B), the
fully MeC-substituted cRNA targets behaved nearly identi- 
cally to the partially DAP-substituted targets. In the other 
test sequence (Figure 4D), the MeC-substituted cRNA be- 
haved nearly identically to the unmodified cRNA target 
and showed better specificity than that of the partially 
DAP-substituted cRNA target with respect to the effect of 
one mismatch. We conclude that the specificities of fully
MeC-substituted cRNA and partially DAP-substituted
cRNA targets are, for the most part, equivalent and that
essentially no loss in specificity (as determined by the effect of
two mismatches) is observed with these modified targets.
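
The mismatch comparison used throughout this section can be sketched as follows: each mismatched probe signal is expressed as a percentage of its perfect-match (parent) signal, and specificity is summarized by the smallest number of mismatches that brings the signal to roughly 10% of the parent. The function names, the 10% cutoff as a parameter, and the signal values are illustrative assumptions; the example values are hypothetical and were chosen only to reproduce the qualitative pattern described for Figure 4D.

```python
def percent_of_perfect_match(signals_by_mismatch):
    """Express each signal as a percentage of the perfect-match (0 mismatch) signal."""
    perfect = signals_by_mismatch[0]
    return {n: 100.0 * s / perfect for n, s in signals_by_mismatch.items()}


def mismatches_needed(signals_by_mismatch, cutoff_percent=10.0):
    """Smallest number of mismatches whose signal is <= cutoff_percent of the parent."""
    percents = percent_of_perfect_match(signals_by_mismatch)
    for n in sorted(percents):
        if n > 0 and percents[n] <= cutoff_percent:
            return n
    return None  # the cutoff was never reached


# Hypothetical signals keyed by number of mismatches (0 = perfect match)
control = {0: 1000.0, 1: 80.0, 2: 40.0, 3: 20.0, 4: 10.0}
partial_dap = {0: 1500.0, 1: 600.0, 2: 120.0, 3: 60.0, 4: 30.0}
full_dap = {0: 2000.0, 1: 1500.0, 2: 800.0, 3: 150.0, 4: 60.0}

needed = [mismatches_needed(s) for s in (control, partial_dap, full_dap)]
```

For these made-up values, one, two, and three mismatches are needed for the control, partially modified, and fully modified targets, respectively.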

An additional metric of specificity on the Codelink bioar- 
rays is the use of 54 negative control bacterial probes spot- 
ted in 4X redundancy, which were designed and 
empirically shown not to cross hybridize to human tran- 
scripts [6]. While the all DAP condition resulted in an in- 
crease in the hybridization intensity for three of the 
negative control probes, the 1:1 DAP:A condition pro- 
duced lower background signals similar to those of the 
unmodified control (data not shown). Thus, increases in 
hybridization signal intensities in the 1:1 DAP:A
condition are attributable to specific modified cRNA target/
DNA probe interactions. We conclude that partial
substitution of adenosine with DAP does not significantly
compromise specificity, and this condition was therefore
investigated further.

Effect of DAP incorporation on differential expression ra- 
tios 

Ultimately, the goal of any global expression profiling sys- 
tem measuring relative transcript abundance is to accu- 
rately and reproducibly determine the changes in 
expression levels between different target samples. We 
therefore determined if the increases in signal intensity 
caused by DAP incorporation had an effect on differential 
ratio calls. Kidney and lymphoma cRNA samples contain- 
ing either no modified NTPs or 1:1 DAP:A were hybrid- 
ized in duplicate to Human Uniset I arrays; average kidney 
to lymphoma ratios were calculated and plotted (Figure 
5) after removing signal outliers defined as a two-fold dif- 
ference between replicates. Differential expression ratios 
demonstrate very good correlation between the unmodi- 
fied control sample and the 1:1 DAP:A sample (r = 0.95) 
on the Human Uniset I microarray. 
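
A correlation of this kind can be computed as sketched below. This is an illustration only: it assumes a Pearson correlation on log2-transformed kidney-to-lymphoma ratios, whereas the text does not state whether the ratios were transformed before correlating; the function names and ratio values are hypothetical.

```python
from math import log2, sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def ratio_correlation(control_ratios, modified_ratios):
    """Correlate log2 ratios (assumption: log transform before correlating)."""
    return pearson_r(
        [log2(r) for r in control_ratios],
        [log2(r) for r in modified_ratios],
    )


# Hypothetical kidney/lymphoma ratios for four probes
control_ratios = [0.5, 1.0, 2.0, 4.0]
dap_ratios = [0.6, 1.0, 1.8, 3.5]
r = ratio_correlation(control_ratios, dap_ratios)
```

The slightly compressed DAP ratios in this toy example still correlate very strongly with the control ratios, since compression changes the slope rather than the correlation.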

In addition to investigating the correlation of the differen- 
tial expression ratios generated from modified and un- 
modified targets, we determined whether DAP 



incorporation affected the variability of the ratio calls. 
One method to measure microarray reproducibility is by 
calculating the coefficients of variation (CVs) for each rep- 
licate probe across arrays. Likewise, CVs can also be calcu- 
lated for multiple ratio calculations across different tissue 
samples. Figure 6 shows a plot of the CVs of the ratios as 
a function of the mean ratios for all of the data points, re- 
gardless of the intensity level of the probe. Consistent with 
earlier findings [6], the majority of the CVs of the
kidney to lymphoma ratios are below 30%, with an aver- 
age of 13.2%, and the variability increases as the ratio ap- 
proaches unity for the unmodified targets (Figure 6A). 
Figure 6B shows that the 1:1 DAP:A condition produces 
ratios with similar overall variability (average CV of 
12.4%) compared to the unmodified control targets. 
When the CVs in the ratios generated using either the con- 
trol or partially modified cRNA are binned according to 
the magnitude of the CV, the partially modified cRNA 
generates more CVs in the less than 20% bins (Figure 6C). 
These data demonstrate that the low variability in ratio 
calls that we are able to routinely obtain [6] is
maintained during DAP incorporation. However, a closer
inspection of the ratios in Figures 6A and 6B reveals that a
possible caveat of DAP use is a slight compression of
ratios along the entire range of calls. This compression is
also observed when the observed ratio is binned accord- 
ing to the magnitude of the fold change (Figure 6D). The 
partially modified cRNA generates more ratios in the one 
to 1.5 fold change bin and fewer ratio changes in the other
bins. This compression is not significant but also mani- 
fests itself in the slope of the correlation line in Figure 5. 
The slope of this line was found to be greater than unity. 
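
The ratio variability measure discussed above can be sketched as follows. The sketch assumes duplicate hybridizations per tissue, so that each probe yields up to four kidney-to-lymphoma ratios, and it uses the sample standard deviation for the CV; both the function names and the choice of sample (rather than population) standard deviation are assumptions, and the signals are hypothetical.

```python
from itertools import product
from statistics import mean, stdev


def is_outlier_pair(a, b, max_fold=2.0):
    """True if two replicate signals differ by more than max_fold."""
    return max(a, b) / min(a, b) > max_fold


def ratio_mean_and_cv(kidney, lymphoma):
    """Mean ratio and percent CV over all kidney/lymphoma replicate pairings."""
    ratios = [k / l for k, l in product(kidney, lymphoma)]
    avg = mean(ratios)
    return avg, 100.0 * stdev(ratios) / avg


# Hypothetical duplicate signals for one probe in each tissue
kidney = (200.0, 220.0)
lymphoma = (100.0, 110.0)

# Replicates that differ more than two-fold would be dropped before this step
keep = not is_outlier_pair(*kidney) and not is_outlier_pair(*lymphoma)
mean_ratio, cv = ratio_mean_and_cv(kidney, lymphoma)
```

For this probe the four ratios average about 2.0 with a CV of roughly 8%, which would fall in the below-20% bins of Figure 6C.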

The differential expression ratios calculated when using 
DAP (Figures 5 and 6) suggest that incorporation of DAP 
has little effect on correlation and reproducibility of ratio 
calls between tissues. However, the ability to accurately 
discriminate between signals due to true hybridization 
events and what may be considered background noise is 
paramount to making accurate differential assessments. A 
lower limit of detection was previously defined using our 
platform by developing a negative control threshold in or- 
der to assign a confidence level to such signal or noise 
queries [6]. Briefly, this threshold was determined by
taking the mean signal of bacterial negative control probes
mentioned above (minus a 10% trim to account for weak 
cross hybridization of high expressers or for true hybridi- 
zation to sequences not in the database) and adding three 
standard deviations (99.7% confidence). Using only 
probes that were at or above the negative control thresh- 
old in both tissues, we observed a significantly greater 
number of probes in the 1:1 DAP:A sample for which we 
could confidently assign a ratio (Figure 7). As Figure 7 
shows, samples containing DAP generated over 400 more 
above-threshold ratios than the control samples containing
only unmodified NTPs. Most of these ratios identified
only in the 1:1 DAP:A sample were small in magnitude,
with fold changes between 1.1 and 1.9.

Figure 6

Effect of partial DAP substitution on the variability and magnitude of differential expression ratios. (A and B) CVs of the ratios
plotted as a function of the mean ratio for unmodified (A) and modified (B) samples. Signal outliers, defined as any pair of
signals which were greater than two-fold different in intensity, were removed from all samples. (C) Plot showing the number of
probes having a designated CV across a range of CVs for control and partially DAP-modified cRNA targets. (D) Plot showing
the number of probes having a designated fold change (ratio) across a range of ratios for control and partially DAP-modified
cRNA targets.
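
The negative control threshold described above (a trimmed mean of the negative control probe signals plus three standard deviations) can be sketched as follows. This is a hedged illustration: it assumes that 10% of the signals are trimmed from each end and that the standard deviation is computed on the trimmed population, neither of which is spelled out in the text; the function names and signal values are hypothetical.

```python
from statistics import mean, stdev


def negative_control_threshold(signals, trim_fraction=0.10, n_sd=3.0):
    """Trimmed mean of negative control signals plus n_sd standard deviations."""
    ordered = sorted(signals)
    k = int(len(ordered) * trim_fraction)
    # Drop the k lowest and k highest signals (assumption: symmetric trim)
    trimmed = ordered[k:len(ordered) - k] if k else ordered
    return mean(trimmed) + n_sd * stdev(trimmed)


def above_threshold_in_both(kidney, lymphoma, threshold):
    """Probes whose signal is at or above threshold in both tissues."""
    return [
        probe for probe in kidney
        if kidney[probe] >= threshold and lymphoma[probe] >= threshold
    ]


# Hypothetical negative control signals: the integers 1..20
negative_controls = [float(v) for v in range(1, 21)]
threshold = negative_control_threshold(negative_controls)

# Hypothetical probe signals in the two tissues
kidney = {"p1": 30.0, "p2": 20.0, "p3": 100.0}
lymphoma = {"p1": 26.0, "p2": 90.0, "p3": 10.0}
callable_probes = above_threshold_in_both(kidney, lymphoma, threshold)
```

Only probes that clear the threshold in both tissues contribute a ratio, so any treatment that lifts weak signals above the threshold increases the number of ratios that can be reported.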

Discussion 

The effect of modified NTPs on hybridization intensities 

Two important issues for identifying differentially ex- 
pressed genes using DNA microarrays are 1) methods to 
increase signal intensity and 2) the ability to accurately 
separate true hybridization signals from background 
noise at the low end of the signal dynamic range. In this 
study, we utilized modified nucleoside triphosphates with 
the aim of enhancing hybridization signal intensity across 
a wide range of probe sequences. Modified ribo- and
deoxyribonucleoside analogs have been used in a variety of



applications [12-14], but their use has been limited in the 
field of microarrays [23]. We demonstrate that 2-ami- 
noadenosine or 2,6-diaminopurine (DAP), an analog of 
adenosine, significantly increases signal intensity for a 
wide range of probe sequences, whereas C-5 methylated 
pyrimidine analogs of cytosine and uridine decreased sig- 
nal intensities across the entire range of probe sequences 
on our platform (Figure 2). 

Although incorporation of DAP did not uniformly in- 
crease the hybridization stability for all target-probe du- 
plexes, even moderate incorporation of DAP (1:1 DAP:A) 
resulted in signal intensity increases (up to 30-fold) for 
the majority of probes on our platform. More important- 
ly, a significant number of probes that went previously 
Figure 7

Number of above-threshold ratios generated by partially DAP-modified and unmodified target samples. A negative control
threshold was imposed to define the lower limit of detection and was calculated by taking the 80% trimmed mean (top 10% and
bottom 10% of signals removed from the population) of 216 negative control probes and adding three standard deviations to
the mean (99.7% confidence). Embryonic kidney and Burkitt's lymphoma samples were performed in duplicate and data were
screened for probes that displayed above-threshold signal intensities. Kidney to lymphoma ratio calls were calculated using
only above-threshold probe signals with a maximum of four ratios for each probe. The data represent the average of the
number of calls for any two kidney and lymphoma slides +/- standard deviation error bars.



undetected (i.e., below a negative control threshold [6])
in the control sample exhibited signal intensities above 
this threshold in DAP-modified samples. Utilization of 
such a modification may prove especially valuable for low 
abundant transcripts which may be difficult to quantify 
because of weak hybridization kinetics. As shown in Fig- 
ures 2B and 2C, the largest increases in signal intensities 
occurred for low and medium expressers relative to the 
unmodified control sample. It is apparent that the effect 
of DAP depends not only on the sequence compositions 
of target and probe, but also on the absolute abundance 
of specific transcripts in the target sample. 

It is of interest to note that the results we report here differ 
from those obtained by others [23]. Those studies report- 
ed increases in signal intensity for targets containing 5- 
methyl UTP while completely substituted targets contain- 
ing DAP had the opposite or no effect. Although we can- 
not completely account for the disparity of the results, one 
key difference in methodology between the studies is that 
their group was not able to normalize concentrations for 
modified and unmodified ATP targets because of dramatic 
differences in cRNA yield. Therefore, mass inputs of cRNA
target samples were much greater for unmodified samples 



compared to the DAP-containing samples. Because the 
sample preparation used in this protocol is highly robust 
and reproducible [24], we were able to use equivalent 
mass inputs of all modified and unmodified targets during
hybridization, allowing us to present our results
quantitatively rather than qualitatively. We do agree that
complete substitution of DAP for adenosine may cause
specificity to decline in certain sequence contexts, which
may limit its usefulness in microarray applications.
However, we have found that reducing the proportion of DAP
incorporated into target samples (e.g., incorporating a 1:1
ratio of DAP:adenosine) is one approach to avoid such
decreases in specificity. Other factors that may contribute to
the dissimilarity of the results include differences in probe 
sequence and size, microarray fabrication/chemistries, la- 
beling/detection methods, and the vigorous mixing that is 
employed in our microarray experiments. We have previ- 
ously shown the dramatic impact of mixing on hybridiza- 
tion signal intensities [6]. Presumably, this impact is due 
to the effect of three dimensional diffusion of the target 
molecules and to the effect on how many probes reach 
equilibrium in the hybridization. It is possible that the de- 
gree of mixing during the hybridization can affect the 
number of probes which show a difference in intensity 
with substitution of DAP for adenosine as well as the
magnitude of this difference. Given these differences, it is not
surprising that our results are not in line with those previ- 
ously reported. 

The results in our study are in good agreement with previ- 
ous reports which include a wide variety of interaction 
types, including DNA:DNA, DNA:RNA, and DNA:PNA
interactions [11-13,25]. These studies have shown that
DAP substitution in oligonucleotide probes can increase 
thermodynamic affinities as reported directly by increases 
in Tm and indirectly through increases in hybridization
intensities.

Biophysical considerations in using modified NTPs 

Modified nucleotides appear to drive the equilibrium 
(particularly for low copy transcripts binding with their 
targets) towards duplex formation. These findings suggest 
that DAP use could be exploited in RNA samples that are 
limited, such as tumor biopsies. Based on our experiments 
with extended hybridization times (Figure 3), use of DAP 
may prove to be a useful alternative to increase array sen- 
sitivity for studies that are unable to generate enough tar- 
get to hybridize or are unable to increase hybridization 
times because of throughput constraints. 

Why do we see differential effects of DAP, MeC, and MeU 
on hybridization intensities? The fact that DAP was able to 
increase hybridization intensities while MeC and MeU de- 
creased hybridization intensities (Figures 2D and 2E) is 
interesting in light of the fact that DAP is a purine incor- 
porated into an RNA strand while MeC and MeU are pyri- 
midines incorporated into an RNA strand. Many studies 
have shown that a ribopurine rich strand bound to a de- 
oxypyrimidine rich strand has a higher stability than the 
corresponding all deoxy strands which, in turn, have a 
higher stability than a deoxypurine rich strand bound to a 
ribopyrimidine rich strand [26,27]. These differences also 
suggest that further studies examining DAP or MeC incor- 
poration into first strand cDNA, followed by formation of 
a DNA-DNA duplex, may not necessarily show the same 
effects as incorporation of these modified nucleotides into 
cRNA. Finally, earlier studies have noted that methylation 
of cytosines (generating MeC) in ribopolynucleotides sta- 
bilizes duplexes to the same extent as substitution of 
thymine (MeU) for uracil [28]. Thus, although it is still 
surprising that MeC and MeU decrease hybridization in- 
tensities, it is not surprising that the magnitude of the ef- 
fects generated by MeC and MeU residues are similar. 

Could the differential effects of DAP versus MeC or MeU 
or the differential effects of DAP on different probe se- 
quences also be related to structural consequences? Earlier 
studies have shown that DAP, in DNA fragments, can widen
the minor groove, as detected by reactivity towards uranyl
nitrate or susceptibility to DNase cleavage [29], and
can relieve compressions in the minor groove associated 
with A-tracts [30]. In contrast, NMR and molecular mod- 
eling studies of an oligonucleotide duplex containing 
MeC have demonstrated that, while this duplex still has a 
B-DNA conformation, there are differences in the structur- 
al parameters and thermal stability relative to the un- 
modified duplex [31]. An earlier study also found, in two 
closely related octanucleotide duplexes, that although the 
methylated duplexes retained their B-DNA conformation, 
different structural and thermal stability effects were seen 
[32]. Although these studies have been carried out with 
DNA-DNA and not DNA-RNA duplexes, it is possible that 
MeC substitution may have different structural conse- 
quences than DAP substitution. 

We have shown that while complete substitution of DAP 
for ATP may cause decreased base mismatch discrimina- 
tion in certain sequence contexts, equimolar ratios of ad- 
enosine and DAP in cRNA target samples did not have 
significant detrimental effects on specificity (Figure 4). Re- 
markably, in one sequence context, the DAP modification 
enhanced specificity against mismatches. Enhanced spe- 
cificity against mismatches has also been observed by oth- 
er groups when DAP is incorporated into PNA oligomers 
[13] and after C5-(1-propynyl)ation of pyrimidines in
DNA-RNA duplexes [33]. 

The effect of modified NTPs on differential expression ratios

The use of DAP to increase hybridization signals is sup- 
ported by the results that show equivalency of both ratio 
calls and ratio variability in 1:1 DAP:A and control sam- 
ples (Figures 5 and 6). When all data points are consid- 
ered in the differential expression ratio analyses, the two 
methods appear equivalent. However, when only those 
data points which are above a threshold value [6] are 
used, ensuring a higher confidence that these intensities 
represent true expression levels and not noise or weak 
cross hybridization, the DAP-modified cRNA enabled an 
~10% increase in the number of ratios that could be generated
(Figure 7). Such increases are critical in order to
sample as many genes as possible in any given microarray 
experiment. For example, some investigators have ana- 
lyzed only data for which the signal intensity is greater 
than approximately 0.4% of the total signal range in both 
channels [34]. Although this method minimizes the vari- 
ance associated with the ratios, many genes and their ex- 
pression changes are missed. Other platforms use a 
minimal intensity level to determine whether a gene is 
present or absent, prior to including this gene in expres- 
sion analyses [35]. It is plausible that DAP modification, 
by increasing intensity levels, particularly for low ex- 
pressors, could enable wider coverage in genome-wide ex- 
pression analyses. 
In summary, incorporation of modified nucleotides may
not only lead to discovery and better quantitation of rare 
expressers in complex samples but also impact other as- 
pects of the microarray experiment such as probe design 
prior to array fabrication and the statistical design of 
microarray experiments. The target preparation procedure 
(generation of amplified cRNA or of first strand cDNA) 
and the subsequent duplex formation may dictate the 
choice of which modified nucleotide to use, as we and 
others have found differential effects with various nucle- 
otides. We believe further investigation in this area could 
be fruitful. 

Methods 

Target preparation 

Five µg of human embryonic kidney or Burkitt's lymphoma
total RNA (Ambion, Inc., Austin, TX) were added to a
reaction mix in a final volume of 12 µl, containing 0.5
pmol T7-(dT)24 oligonucleotide primer. The mixture was
incubated for 10 minutes at 70°C and chilled on ice. With
the mixture remaining on ice, 4 µl of 5X first-strand buffer,
2 µl 0.1 M DTT, 1 µl of 10 mM dNTP mix and 1 µl SuperScript™
II RNase H⁻ reverse transcriptase (200 U/µl) were
added to make a final volume of 20 µl, and the mixture was
incubated for one hour in a 42°C water bath. Second-strand
cDNA was synthesized in a final volume of 150 µl, in a
mixture containing 30 µl of 5X second-strand buffer, 3 µl
of 10 mM dNTP mix, 4 µl of E. coli DNA polymerase I (10
U/µl) and 1 µl of RNase H (2 U/µl) for 2 hours at 16°C.
The cDNA was purified using a Qiagen QIAquick purification
kit, dried down, and resuspended in IVT reaction mix,
containing 3.0 µl nuclease-free water, 4.0 µl 10X reaction
buffer, 4.0 µl 75 mM ATP, 4.0 µl 75 mM GTP, 3.0 µl 75
mM CTP, 3.0 µl 75 mM UTP, 7.5 µl 10 mM Biotin-11-CTP,
7.5 µl 10 mM Biotin-11-UTP and 4.0 µl enzyme mix
(unmodified control condition). Commercially available
2-aminoadenosine-5'-triphosphate, 5-methylcytidine-5'-triphosphate,
and 5-methyluridine-5'-triphosphate
(TriLink Biotechnologies, Inc., San Diego, CA) were substituted
for ATP, CTP, and UTP, respectively, in separate
reactions containing either complete substitution or a 1:1 or
1:3 ratio of modified:unmodified NTP, keeping the molar input
of nucleotide constant. The reaction mix was incubated
for 14 hours at 37°C and the cRNA target was purified using an
RNeasy Kit (Qiagen). cRNA yield was quantitated by
measuring the UV absorbance at 260 nm, and the cRNA was fragmented
in 40 mM Tris-acetate (TrisOAc), pH 7.9, 100 mM KOAc,
and 31.5 mM MgOAc, at 94°C, for 20 minutes. This typically
resulted in fragmented target with a size range of
100-200 bases.
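
As a quick consistency check on the IVT recipe above, the final nucleotide concentrations can be worked out from the listed volumes and stock concentrations. The following is a minimal sketch, assuming that the dried cDNA contributes negligible volume and that modified NTP stocks have the same concentration as the NTPs they replace (neither point is stated in the text); the helper names are hypothetical.

```python
# Volumes (µl) and stock concentrations (mM) as listed for the unmodified control
# reaction. None marks components that are not nucleotides.
components = {
    "water": (3.0, None),
    "10X reaction buffer": (4.0, None),
    "ATP": (4.0, 75.0),
    "GTP": (4.0, 75.0),
    "CTP": (3.0, 75.0),
    "UTP": (3.0, 75.0),
    "Biotin-11-CTP": (7.5, 10.0),
    "Biotin-11-UTP": (7.5, 10.0),
    "enzyme mix": (4.0, None),
}

# Assumption: the dried-down cDNA adds no volume, so the mix volume is the sum.
total_volume = sum(volume for volume, _ in components.values())


def final_mM(name):
    """Final concentration (mM) of one nucleotide in the assembled reaction."""
    volume, stock = components[name]
    return volume * stock / total_volume


def pool(plain, biotinylated):
    """Total concentration (mM) and biotinylated fraction for one base."""
    p, b = final_mM(plain), final_mM(biotinylated)
    return p + b, b / (p + b)


def split_volume(total_ul, modified_parts, unmodified_parts):
    """Split an NTP volume into (modified, unmodified) at a given ratio.

    Assumption: both stocks have the same concentration, so keeping the
    summed volume constant keeps the molar input constant.
    """
    modified = total_ul * modified_parts / (modified_parts + unmodified_parts)
    return modified, total_ul - modified


print(total_volume, final_mM("ATP"), pool("CTP", "Biotin-11-CTP"))
print(split_volume(4.0, 1, 1), split_volume(4.0, 1, 3))
```

Under these assumptions each of the four bases ends up at the same total concentration, with one quarter of the C and U pools carrying biotin.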

Array hybridization 

Two µg of fragmented target cRNA was used for hybridization
of each UniSet Human I Expression Bioarray (Motorola
Life Sciences) containing 9589 probes
(representing 9,203 unique accession numbers (genes),
corresponding to approximately 8,935 unique clusters 
and 386 control probes, selected initially from GenBank 
Unigene build #125) or for hybridization to a microarray 
containing probes corresponding to 1100 human genes, 
each spotted 6 times per array. All probes on these micro- 
arrays are 30-mer oligonucleotides spotted by piezoelec- 
tric technologies and covalently attached to a polymeric 
matrix [6]. These microarrays were hybridized, washed, 
and processed using a direct detection method of the bi- 
otin-containing transcripts by a Streptavidin-Alexa647 
conjugate as previously described [6]. Processed slides 
were scanned using an Axon GenePix Scanner with the la- 
ser set to 635 nm, a PMT voltage of 600, and a scan reso- 
lution of 10 microns. 

Data analysis

Slides were scanned and images for each slide were quan- 
titated using CodeLink Scanning and Analysis Software 
(Motorola Life Sciences). Signal intensities for each spot 
were calculated by summation of the pixel intensities for 
each spot, followed by local background subtraction 
(based on the median pixel intensity of the area surround- 
ing each spot). Whole array data normalization, when 
used, was performed independently for each slide by di- 
viding each spot's intensity (after background subtrac- 
tion) by the median signal intensity of all test probes. All 
false positives, determined by visual inspection of the im- 
ages, which were greater than 2-fold different between du- 
plicate arrays were removed. 
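
The quantitation steps described above can be sketched as follows. This is only an illustration and not the CodeLink implementation: the function names are hypothetical, and scaling the median background by the number of spot pixels before subtracting it from the summed intensity is an assumption, since the text does not say how the background is applied to the sum.

```python
from statistics import median


def spot_signal(spot_pixels, surrounding_pixels):
    """Summed spot intensity minus a local background estimate.

    Assumption: the background is the median of the surrounding pixels,
    scaled by the number of pixels in the spot.
    """
    local_background = median(surrounding_pixels)
    return sum(spot_pixels) - local_background * len(spot_pixels)


def normalize(signals):
    """Divide every background-subtracted signal by the array-wide median."""
    array_median = median(signals.values())
    return {probe: value / array_median for probe, value in signals.items()}


# Hypothetical pixel values and probe names, for illustration only.
print(spot_signal([10, 12, 14], [2, 2, 3, 1]))
print(normalize({"probe_a": 10.0, "probe_b": 20.0, "probe_c": 40.0}))
```

Normalizing each slide to its own median puts arrays with different overall brightness on a common scale before ratios are formed.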

Digestion and chromatography of cRNA

Four units of P1 nuclease were used to digest 20-50 µg of
cRNA to generate nucleotide monophosphates. The en- 
zyme was incubated with the cRNA at 55°C for 6 hours, 
then for 6 hours at 37°C in the presence of 10 units of calf 
intestine alkaline phosphatase to generate the nucleo- 
sides. The digested products were purified using Microcon 
YM-3 columns followed by centrifugation at 8000 g for 
30-60 minutes. The mix was then concentrated using a 
SpeedVac to 100 µl. This solution was analyzed on an
HPLC column equilibrated with 0.03 M TEAA (Solvent A)
at a flow rate of 1 ml/min. The following gradient was
used: 0-1% Solvent B (95% AcCN, 5% Solvent A) over 5
minutes, 1-15% Solvent B over the next 15 minutes, 15- 
45% Solvent B over the next 30 minutes, 45-100% Sol- 
vent B over the next 20 minutes, and hold at 100% B for 
2 minutes. The concentration of the heterocycles was de- 
termined by the absorbance values at 260 nm (the wave- 
length where maximal absorption occurs for the 
heterocycles) and the biotin-containing nucleoside con- 
centrations were determined by the absorbance values at 
294 nm (the wavelength where maximal absorption oc- 
curs for biotinylated cytosine and biotinylated uridine). 
The biotin-11-UTP peak was measured by absorbance at
294 nm with the extinction coefficient = 13000 M⁻¹ cm⁻¹.
The 2-amino ATP was measured by absorbance at
260 nm with the extinction coefficient = 9894 M⁻¹ cm⁻¹.
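
The conversion from absorbance to concentration implied by these extinction coefficients follows the Beer-Lambert law, c = A / (ε·l). A minimal sketch, assuming a 1 cm path length (not stated in the text) and using a made-up absorbance reading:

```python
# Extinction coefficients (M^-1 cm^-1) quoted in the text.
EPSILON_BIOTIN_11_UTP_294NM = 13000.0
EPSILON_2_AMINO_ATP_260NM = 9894.0


def molar_concentration(absorbance, epsilon, path_cm=1.0):
    """Beer-Lambert law: c = A / (epsilon * l).

    Assumption: path_cm defaults to 1 cm; the actual detector path length
    is not given in the text.
    """
    return absorbance / (epsilon * path_cm)


# Hypothetical reading of 0.13 absorbance units at 294 nm.
print(molar_concentration(0.13, EPSILON_BIOTIN_11_UTP_294NM))
```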
ABSTRACT

The possibility of equalizing DNA duplex stability is 
essential for the application of sequencing by hybrid- 
ization. In this paper we describe a new strategy to 
obtain DNA duplexes with a thermal stability indepen- 
dent of their base content. Modified *C bases have 
been developed and incorporated into oligonucleo- 
tides. The influence of these modifications on duplex 
stability has been studied by absorption spectroscopy, 
thus allowing selection of N-4-ethyl-2'-deoxycytidine
(d 4Et C), which hybridizes specifically with natural dG 
to give a G 4Et C base pair whose stability is very close 
to that of natural AT base pairs. Duplexes built with AT 
and/or G 4Et C base pairs exhibit thermal stabilities 
independent of their base content in a classical buffer 
solution, thus enabling control of the stability of DNA 
hybrids as a function of their length only. 

INTRODUCTION 

Specific duplex formation between an oligonucleotide and a 
DNA sequence is the foundation of nucleic acid analysis and 
enzymatic labeling of DNA fragments. The success of these 
techniques implies absolute discrimination of perfect hybrids 
from ones containing mismatches to avoid false positive results 
and the design of hybrids having equivalent stabilities, which 
leads to homogeneous results. 

The recently reported reverse hybridization technique is based 
on detection of perfect hybrids formed between one or several 
labeled DNA fragments with every oligonucleotide of a given 
length in a complete set immobilized as a two-dimensional matrix 
(1-3). This method is based on the possibility of discriminating 
perfect hybrids from those containing mismatches. This can be 
performed by increasing the temperature to allow dissociation of 
hybrids with mismatches before those without mismatches. 
Reverse hybridization could be a method for the design of a fast 
analysis technique for large DNA sequences. However, the 
approaches described in the literature using natural oligonucleotides 
serving as probes have a serious drawback, due to the base 
composition dependence of duplex stability. It is well known that
a GC base pair with three hydrogen bonds is more stable than AT
or AU base pairs, which have only two hydrogen bonds. A perfect 
hybrid built with AT-rich sequences would therefore have a 
similar or even lower stability than do hybrids built with GC-rich 
sequences involving one mismatch. This leads to false positive or 
false negative signals depending on the hybridization temperature 
and washing conditions. 
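
The base composition dependence described above can be illustrated with the Wallace rule, a common rule of thumb for short oligonucleotides in high salt: Tm ≈ 2°C per A or T plus 4°C per G or C. This rule is not used in the present work and is introduced here purely as an illustrative assumption. It ignores sequence context and mismatches, so it shows only the trend, not measured values.

```python
def wallace_tm(sequence):
    """Rough Tm estimate (°C) for a short oligonucleotide (Wallace rule).

    Illustration only: 2 °C per A/T and 4 °C per G/C, no mismatch handling.
    """
    sequence = sequence.upper()
    at = sequence.count("A") + sequence.count("T")
    gc = sequence.count("G") + sequence.count("C")
    if at + gc != len(sequence):
        raise ValueError("sequence must contain only A, C, G and T")
    return 2 * at + 4 * gc


# Hypothetical 9-mers spanning the composition range.
for oligo in ("ATATATATA", "AGCAGCAGC", "GCGCGCGCG"):
    print(oligo, wallace_tm(oligo))
```

On this crude estimate, nonamers of equal length span an 18°C range depending on composition alone, which is the spread the modified bases are meant to remove.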

Several techniques have been studied to alleviate this problem 
but none of them have been successfully completed. Hybridization 
studies using TMACl to reduce differential DNA duplex stability
according to the base composition have been described and
developed by several groups (4-8). However, this process
requires multimolar concentrations of TMACl, which is very
viscous, leading to manipulation difficulties, and is not adapted 
to biochemical procedures involving enzymes, such as random 
priming or LCR, which are carried out under weak ionic 
concentrations (9). The proposed concentration variation of each 
particular oligonucleotide (3) or changing of the oligonucleotide 
length as a function of its base composition would be technically 
complex and impractical when applied to a large number of 
oligonucleotide sequences. Another option, the use of modified 
bases such as 5-CldU and 2-NH2dA (10), has been examined, but 
with a great variation in stability. To reduce the stability of 
GC-rich duplexes, oligonucleotides were built with ribo- and 
deoxyribonucleotides, but the amount of hybrid obtained greatly 
decreased as the number of ribo↔deoxyribo transitions increased.
Moreover, hybrid thermal dissociation spread over a wide 
temperature range, making it difficult to discriminate between 
perfect hybrids and those containing a mismatch (11). 

In this paper we describe a new strategy for obtaining DNA 
duplexes whose thermal stabilities in NaCl solution are independent 
of their base content. Our approach consists of modification of
one of the four natural deoxynucleosides which forms, with the 
complementary nucleoside, a base pair whose stability is very 
close to that of the other base pair. To achieve this end, we chose 
to modify 2'-deoxycytidine (d*C), which hybridizes specifically 
with natural 2'-deoxyguanosine (dG) to give a G*C base pair 
having a stability very similar to that of the AT base pair. We 
describe in this paper the preparation of modified oligonucleotides 
and selection of d 4Et C, which gives a G 4Et C base pair having a
stability similar to that of the AT base pair. 

MATERIALS AND METHODS 

All reagents and solvents were of reagent grade quality and used 
without further purification. Cytosine arabinoside phosphoramidite
was obtained from Glen Research. Unmodified oligonucleotides 
were from Appligene-Oncor. Absorption spectra were recorded 
on a UVIKON 860 (Kontron). Absorption studies were carried 
out on a UVIKON 941 cell changer spectrophotometer (Kontron). 
Analytical TLC was carried out on Merck 5554 Kieselgel 60 F254
plates and eluted with various eluents: system A,
CH2Cl2/MeOH (95:5 v/v); system B, CH2Cl2/MeOH (90:10 v/v);
system C, CH2Cl2/AcOEt/Et3N (45:45:10 v/v/v). Merck 9385
Kieselgel 60 was used for column chromatography. All
4,4'-dimethoxytrityl-containing substances were identified as orange
colored spots on TLC plates by spraying with 10% perchloric acid
solution. HPLC was performed on a Waters 626 E (system 
controller) equipped with a Waters 996 photodiode array detector. 
Analysis and purification by ion exchange chromatography were 
performed with a FPLC apparatus (Pharmacia). NMR experiments 
were carried out on a Bruker AM 300 WB spectrometer. 

Synthesis of modified nucleosides 1 

5'-O-Dimethoxytrityl-N-4-ethyl-2'-deoxycytidine 1b. 1,2,4-Triazole
(0.047 mol, 3.1 g) was suspended in anhydrous CH3CN (50 ml)
at 0°C followed by addition of POCl3 (0.01 mol, 0.98 ml) with
rapid stirring. Triethylamine (0.052 mol, 7.3 ml) was then added
dropwise to the slurry, which was stirred at 0°C for an additional 30 min.
5'-O-Dimethoxytrityl-3'-O-(t-butyldimethylsilyl)-2'-deoxyuridine
(0.031 mol, 2 g), obtained as described in the literature (12),
dissolved in 6 ml CH3CN was added dropwise at 0°C. The
ice/water bath was removed and the mixture allowed to react with
magnetic stirring for 4 h at room temperature. An ethylamine
solution (0.15 mol, 16 M in CH3CN) was added directly to the
crude triazolyl derivative obtained above. The reaction was
monitored by TLC analysis and was complete after 2 h.
The solution was concentrated under reduced pressure, dissolved
in dichloromethane and washed with 5% sodium hydrogen
carbonate solution. After being dried over Na2SO4 and concentrated
to dryness, the residue was purified on a silica gel column using
CH2Cl2/MeOH/Et3N (98:1:1 to 97:2:1 v/v/v) as eluent. Yield
81% (1.7 g), Rf system A 0.26, Rf system B 0.7.

The syntheses of
5'-O-dimethoxytrityl-3'-O-(t-butyldimethylsilyl)-N-4-methyl-2'-deoxycytidine,
5'-O-dimethoxytrityl-3'-O-(t-butyldimethylsilyl)-N-4-propyl-2'-deoxycytidine,
5'-O-dimethoxytrityl-3'-O-(t-butyldimethylsilyl)-N-4-allyl-2'-deoxycytidine and
5'-O-dimethoxytrityl-3'-O-(t-butyldimethylsilyl)-N-4-propargyl-2'-deoxycytidine
were carried out as described above by replacing
ethylamine by methylamine, propylamine, allylamine and
propargylamine respectively (13).

The 5'-O-dimethoxytrityl-3'-O-(t-butyldimethylsilyl)-N-4-ethyl-2'-deoxycytidine
obtained above (0.0025 mol, 1.7 g) was treated
with a 1 M tetrabutylammonium fluoride solution (0.005 mol, 5 ml)
in THF at room temperature. The reaction, monitored by TLC,
was complete after 2 h. The reaction mixture was concentrated
under reduced pressure and the residue dissolved in CH2Cl2 and
washed with 5% sodium hydrogen carbonate solution. After
being dried over Na2SO4 and concentrated to dryness the
obtained residue was purified on a column of silica gel using
CH2Cl2/MeOH (98:2 to 95:5 v/v) as eluent. Yield 1b 78%
(1.1 g), Rf system B 0.55. 1H NMR (DMSO): δ 1.1 (t, 3H,
-NH-CH2-CH3), 2 (m, 1H, H2'), 2.2 (m, 1H, H2''), 3.2 (m, 2H,
-NH-CH2), 3.2-3.3 (m, 2H, H5', H5''), 3.7 (s, 6H, -O-CH3), 3.9
(m, 1H, H4'), 4.3 (m, 1H, H3'), 5.1 (d, 1H, H5), 6.2 (m, 1H, H1'),
6.8-7.3 (m, 18H, Ph), 7.6 (d, 1H, H6), 7.6 (t, 1H, NH); s: singlet,
d: doublet, t: triplet, m: multiplet, Ph: phenyl.

Compounds 1a, 1c, 1d and 1e were obtained as described for
compound 1b starting from
5'-O-dimethoxytrityl-3'-O-(t-butyldimethylsilyl)-N-4-methyl-2'-deoxycytidine,
5'-O-dimethoxytrityl-3'-O-(t-butyldimethylsilyl)-N-4-propyl-2'-deoxycytidine,
5'-O-dimethoxytrityl-3'-O-(t-butyldimethylsilyl)-N-4-allyl-2'-deoxycytidine and
5'-O-dimethoxytrityl-3'-O-(t-butyldimethylsilyl)-N-4-propargyl-2'-deoxycytidine
respectively. Yield 1a 78%, Rf 0.53; yield 1c 84%, Rf 0.59;
yield 1d 90%, Rf 0.63; yield 1e 72%, Rf 0.64 (system B).

Synthesis of phosphoramidite derivatives 2 

5'-O-Dimethoxytrityl-3'-O-(2-cyanoethyl-N,N-diisopropylamidophosphite)-N-4-ethyl-2'-deoxycytidine 2b.
Compound 1b was dried
by several co-evaporations and left in a desiccator overnight.
2-Cyanoethyl-N,N-diisopropylamidochlorophosphite (0.53 mmol,
0.12 ml) was added dropwise under argon atmosphere to a
magnetically stirred mixture of compound 1b (0.35 mmol, 200 mg)
and diisopropylethylamine (1.43 mmol, 0.24 ml) in anhydrous
dichloroethane (4 ml) at room temperature. The phosphitylation
reaction was monitored by TLC analysis. After 1 h the reaction
mixture was diluted with ethyl acetate and washed with 10%
sodium hydrogen carbonate solution, then with saturated sodium
chloride solution. The organic solution was dried over sodium
sulfate and concentrated under reduced pressure. The residue was
purified on a silica gel column using CH2Cl2/AcOEt/Et3N
(60:30:10 v/v/v) as eluent. The collected fractions containing the
phosphoramidite compound 2b were pooled and concentrated
under reduced pressure. Compound 2b was then obtained as a
white powder after precipitation in cold hexane (-70°C). This
compound was isolated by quick filtration and then dried in a
desiccator. Yield 2b 48% (m = 130 mg), Rf system A 0.35 and
0.33, Rf system C 0.68 and 0.61.

The phosphoramidites 2a, 2c, 2d and 2e were obtained starting
from modified nucleosides 1a, 1c, 1d and 1e respectively,
following the procedure described for preparation of 2b. Yield 2a
46%, Rf system A 0.36 and 0.32, Rf system C 0.52 and 0.43;
yield 2c 47%, Rf system A 0.39 and 0.34, Rf system C 0.73 and
0.68; yield 2d 42%, Rf system D 0.35 and 0.30, Rf system C
0.72 and 0.67; yield 2e 49%, Rf system A 0.35 and 0.31, Rf
system C 0.75 and 0.69.

Synthesis, purification and characterization of modified
oligonucleotides

Chain assembly was carried out on a Pharmacia Gene Assembler 
on solid support CPG (controlled pore glass) functionalized with 
a nucleoside using phosphoramidite chemistry (14). Syntheses 
were performed on a 1 µmol scale using 10 µmol commercial
phosphoramidite or modified phosphoramidite, prepared as
previously described, per cycle with a cycle time of 10 min and
a coupling time of 1.5 min for the commercial phosphoramidite.
The coupling time of the modified phosphoramidite was increased
to 2.5 min. The coupling yields for modified dC phosphoramidites
were as good as for the natural ones. The oligonucleotides obtained
were then deblocked by treatment with concentrated ammonia
overnight at 60°C. After deprotection and extraction of the
organic impurities, the crude reaction products were purified by
ion exchange chromatography using a Pharmacia FPLC system
equipped with a DEAE column (8 µm, 100 x 10 mm) from
Waters. A 25 mM Tris-HCl buffer, pH 8, in acetonitrile/water
(10:90 v/v) with a linear gradient of NaCl from 0 to 0.45 M over
40 min at 1 ml/min was used as eluent. The fractions were
monitored by absorption at 254 nm. Rt ~25 min for all compounds.

After desalting, the purity of all oligonucleotides described was 
checked by reverse phase analysis using a Lichrocart system (125
x 4 mm) packed with 5 µm Lichrospher RP 18 from Merck with
a linear gradient of acetonitrile from 5 to 20% for 20 min in 0.1 M
aqueous triethylammonium acetate buffer, pH 7, with a flow rate
of 1 ml/min. The retention times (Rt) of oligonucleotides
5'-d(CGAYGACGA)-3' involving one modified C at position 4 were as
follows: Y = 4Me C, Rt 12.1 min; Y = 4Et C, Rt 12.2 min; Y = 4Pr C,
Rt 13 min; Y = 4allyl C, Rt 12.8 min; Y = 4propargyl C, Rt 12.9 min.

Full deprotection and nucleoside composition of the modified 
oligonucleotides were ascertained by nuclease degradation. An 
aliquot of oligonucleotide was digested with snake venom 
phosphodiesterase (Pharmacia Biotech) and alkaline phosphatase 
(Boehringer) in 0.1 M Tris-HCl, pH 8.2, for 19 h at room 
temperature. After inactivation of the enzyme at 90 °C for 2 min, 
the digestion products were analyzed by reverse phase chromatography
using a Lichrocart system (125 x 4 mm) packed with
Nucleosil 100-5 C18 from Macherey-Nagel equilibrated with 0.1 M
aqueous triethylammonium acetate buffer, pH 7. The column was
eluted at a flow rate of 1 ml/min with 0.1 M aqueous triethylammonium
acetate buffer, pH 7, for 15 min and then with a linear
gradient of 0-20% acetonitrile in 0.1 M aqueous triethylammonium 
acetate buffer, pH 7, for 40 min. Detection was performed at 260 nm. 
All oligonucleotides were totally degraded to nucleosides. In each 
case four peaks were obtained. Comparison with natural and 
modified nucleoside samples allowed us to identify the different 
peaks. Three of them whose retention times were identical for 
every oligonucleotide hydrolyzed correspond to dC (Rt 3.5 min),
dI (Rt 9.1 min) and dG (Rt 11.5 min) respectively. The presence of
dI, resulting from deamination of dA, could be due to contamination
of the nucleases by adenosine deaminase. Peaks corresponding to
the modified dC nucleosides were eluted with the following
retention times: d 4Me C, Rt 7.1 min; d 4Et C, Rt 14.9 min; d 4Pr C, Rt
25.9 min; d 4allyl C, Rt 19.8 min; d 4propargyl C, Rt 12.6 min.

Melting experiments 

Changes in absorbance with temperature of 2 µM duplexes in
10⁻² M sodium cacodylate buffer, pH 7, containing 1 M NaCl and
2 × 10⁻⁴ M EDTA were measured at λ = 260 nm in a UVIKON 941
cell changer spectrophotometer equipped with a Huber PD 415
temperature programmer connected to a cryothermostat ministat
circulating water bath (Huber). Samples and references were
slowly heated at a rate of 0.5°C/min from 0 to 80°C. Melting
temperatures (Tm) were taken as the temperature corresponding
to half-dissociation of the complexes. The Tm values were
determined using the first and second derivatives. The molar
extinction coefficients of the sequences were determined as
described in the literature (15).
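
The derivative-based Tm determination mentioned above can be sketched as follows. This is a simplified illustration rather than the authors' procedure: it locates the maximum of a central-difference first derivative dA/dT and does not use the second derivative. The melting curve is a synthetic sigmoid with made-up absorbance values, sampled every 0.5°C from 0 to 80°C; all function names are hypothetical.

```python
import math


def first_derivative(temps, absorbances):
    """Central-difference estimate of dA/dT at the interior points."""
    return [
        (absorbances[i + 1] - absorbances[i - 1]) / (temps[i + 1] - temps[i - 1])
        for i in range(1, len(temps) - 1)
    ]


def melting_temperature(temps, absorbances):
    """Temperature at which dA/dT is largest."""
    derivative = first_derivative(temps, absorbances)
    i_max = max(range(len(derivative)), key=derivative.__getitem__)
    return temps[i_max + 1]  # derivative[0] corresponds to temps[1]


def synthetic_curve(temps, tm, width=3.0, low=0.20, high=0.25):
    """Idealized sigmoidal melting curve (illustrative values only)."""
    return [low + (high - low) / (1.0 + math.exp(-(t - tm) / width)) for t in temps]


temps = [0.5 * i for i in range(161)]  # 0 to 80 °C in 0.5 °C steps
curve = synthetic_curve(temps, tm=46.0)
print(melting_temperature(temps, curve))
```

For real data, noise usually has to be smoothed before differentiation, and for a bimolecular transition the derivative maximum is only an approximation of the half-dissociation temperature.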
RESULTS AND DISCUSSION

Design and synthesis of modified oligonucleotides 

Several approaches can be used to abolish differential binding 
stability of the hybrids dependent on base composition, among 
which is modification of one of the four natural 2'-deoxynucleo- 
sides, which then forms, with the complementary nucleoside, a 
base pair showing a stability very close to that of other base pairs. 
We can also modify both 2'-deoxynucleosides in a base pair to 
obtain a stability similar to that of the other natural base pairs. 
Another approach is based on modification of two non-comple- 
mentary 2'-deoxynucleosides which hybridize with the complemen- 
tary nucleoside to give base pairs with similar stabilities. 

These investigations allow the design of one modified 2'-deoxy- 
cytidine, d*C, which hybridizes specifically with natural dG to 
give a G*C base pair whose stability is very similar to that of an 
AT base pair. This choice was dictated by the following criteria: 
it is easier to find a modified GC base pair whose stability is 
similar to that of an AT natural base pair than to design a modified 
AT base pair whose stability is close to that of a GC natural base 
pair; preparation of oligonucleotides containing dC analogs is 
simpler than that of oligonucleotides built with dG analogs; 
modification of only one base pair rather than both simplifies the 
enzymatic preparation of DNA containing one or several modified 
nucleosides. 

We have chosen as dC analogs the araC and dC derivatives, in 
which one hydrogen of the exocyclic amino group at position 4 
is substituted by an alkyl group such as methyl, ethyl, n-propyl,
allyl or propargyl groups. In fact, it is well known that replacement
of one dC by ara C (16) or d 4Me C (17) in an oligonucleotide
induces a decrease in thermal stability of the hybrid formed by this 
oligonucleotide and the complementary nucleic acid sequence. 

Synthesis of the N-4 substituted 2'-deoxycytidine 

The synthesis of
5'-O-dimethoxytrityl-3'-O-(2-cyanoethyl-N,N-diisopropylamidophosphite)-N-4-substituted-2'-deoxycytidine
(2a, 2b, 2c, 2d and 2e) was carried out from commercial deoxyuridine
as described in Figure 1. It consists first of protection of the 5'-
and 3'-hydroxyl functions of deoxyuridine by dimethoxytrityl
and t-butyldimethylsilyl groups respectively (12). Conversion of
2'-deoxyuridine to N-4 substituted-2'-deoxycytidine was achieved
by activation at the C4 position of the protected 2'-deoxyuridine
by treatment with phosphorus oxychloride in the presence of
1,2,4-triazole followed by treatment with the primary amines
(13). After deprotection of the 3'-hydroxyl function by the action
of tetrabutylammonium fluoride, phosphitylation at the 3'-position
was achieved using
2-cyanoethyl-N,N-diisopropylamidochlorophosphite.

Hybridization properties of oligonucleotides containing 
a modified d*C 

Preparation of duplexes in which every natural dC is replaced by 
a modified d*C requires a great deal of work. Therefore, studies 
were first carried out with 9 bp duplexes composed of a triplet 
repeated three times involving an AT, GC or G*C base pair at 
position 4 (Table 1). We chose to investigate the stabilities of 
duplexes composed of a tridecamer and a nonamer to mimic 
hybrids formed between an oligonucleotide probe and a longer 
nucleic acid sequence. In fact, it is well known that the presence 
of dangling arms at both the 3'- and 5'-positions of the
oligonucleotide leads to good stabilization of a duplex which
varies with its length (18). The results reported in Table 1 led to
the following observations.

Figure 1. Synthesis of modified oligonucleotides involving N4-substituted-
2'-deoxycytidine. DMT = dimethoxytrityl; EtCN = 2-cyanoethyl; iPr = isopropyl;
(i) 2-cyanoethyl-N,N-diisopropylamidochlorophosphite, diisopropylethylamine;
(ii) assembly of oligonucleotides; (iii) deprotection; (iv) purification.

For duplexes 3-5 involving N4-substituted dC the Tm,
measured as described in Materials and Methods,
decreases as the number of carbon atoms in the alkyl
group (Me, Et, n-Pr) increases. The Tm decrease is ~2°C when one
methylene group is added (Fig. 2). Duplexes 6 and 7, involving
an allyl or propargyl group respectively, led to a Tm slightly higher
than that of duplex 5, involving an n-propyl group. This
difference could be due to a steric effect or the higher positive
inductive effect of the n-propyl group versus that of the allyl or
propargyl group. Note that the difference in Tm is small and
within the margin of error, which is ~±1°C. The decrease
in thermal stability of duplex 3, involving one G 4Me C base pair,
compared with that of duplex 2, possessing a natural GC base pair
at the same position, is small (ΔTm ~2.5°C). Duplexes 3 and 4,
involving a G 4Me C and a G 4Et C base pair respectively, are thermally more
stable than duplex 1, having an AT base pair at the same position
(ΔTm 3 and 1°C respectively).

The stabilities of double-stranded DNA involving natural nucleosides
are highly dependent on base composition.
Thus, replacement of one AT base pair at position 4 of hybrid
1 (Tm1 46°C) by one GC base pair led to a 5°C increase in Tm (Tm2
51°C), whereas replacement of this same AT base pair by one
G*C base pair (particularly *C = d 4allyl C and d 4propargyl C) led to
hybrids having a stability very similar to that of duplex 1.

Specificity of the G*C base pair 

Figure 2. Tm variations of

5'-d(TTTCGTCGTCGTT)-3'
3'-d(AGCAG*CAGC)-5'

(*C = 4(CH2)n C) as a function of the carbon atom number (n) of the substituent.

Table 1. Melting temperatures at λ 260 nm of natural
duplexes and those involving a modified *C at position 4

5'-d(TTTCGTCXTCGTT)-3'
3'-d(AGCAGYAGC)-5'

Tm values were determined at an oligomer strand concentration
of 2 µM in 10⁻² M sodium cacodylate buffer, pH 7,
containing 1 M NaCl and 2 × 10⁻⁴ M EDTA.

To verify the recognition specificity of dG by d*C, thermal
denaturation studies were carried out with duplexes involving the
mismatches XC and X*C (X = T, C or A; *C = 4Me C, 4Et C, 4Pr C,
4allyl C, 4propargyl C or ara C) (Table 2). The results obtained showed
the following.

Specificity of G*C base pair formation was maintained. The
presence of XC and X*C mismatches at position 4 led, in all
cases, to a decrease of >20°C in Tm value.

The order of stabilities for duplexes involving XC and X*C
mismatches was the same:
Tm (GC) >> Tm (TC) > Tm (AC) > Tm (CC);
Tm (G 4R C) >> Tm (T 4R C) > Tm (A 4R C) > Tm (C 4R C), R = Me,
Et, Pr, allyl or propargyl;
Tm (G ara C) >> Tm (T ara C) > Tm (A ara C) > Tm (C ara C).
Table 2. Melting temperature values (°C) at λ 260 nm
of perfect duplexes and ones containing a mismatch
at position 4

5'-d(TTTCGTCXTCGTT)-3'
3'-d(AGCAGYAGC)-5'

Tm values were determined at a sample concentration
of 2 µM in 10⁻² M sodium cacodylate buffer,
pH 7, containing 1 M NaCl and 2 × 10⁻⁴ M EDTA.

The presence of CC or C*C mismatches has a very destabilizing 
effect. ara C seems to give a better discrimination than do C and 
4R C (Table 2). 

Choice and hybridization properties of duplexes 
involving AT and/or G 4Et C base pairs 

Although the previously obtained Tm values of
duplexes 6 and 7, involving d 4allyl C (Tm6 45.5°C) and d 4propargyl C
(Tm7 45.5°C) respectively, were closest to that of duplex 1,
involving an AT base pair (Tm1 46°C), we preferred to continue
our studies with d 4Et C, which forms with G a G 4Et C
base pair (Tm4 47°C) slightly more stable than the AT base pair
(Tm1 46°C).

This modification was chosen in order to obtain modified 
duplexes with a not too low thermal transition and to minimize the 
possible steric effect of the alkyl group when duplexes were built 
with several contiguous d*C. We also decided to eliminate 4Me C
and ara C because the first leads to a clearly more stable G 4Me C
base pair (Tm3 49°C) than does the AT base pair (Tm1 46°C),
whereas the triphosphate of the latter could not be used by
polymerases (19) for the preparation of a modified DNA
fragment, in contrast to N4-substituted dC derivatives, which are
accepted by DNA polymerase (20). Since 4Me C is present in
DNA of certain thermophilic bacteria (21), we can expect that the
4Et C derivative has the same physicochemical and biochemical
properties. Therefore, studies were carried out with duplexes 
composed of 9 bp involving various base pairs ranging from nine 
AT base pairs to nine GC or nine G 4Et C base pairs (Table 3). 

To determine the effect of replacing an AT base pair with a GC 
or G 4Et C base pair, studies were carried out on duplexes 2-1L 
having the same transition of purine^pyrimidine (duplexes Ifi 
and 11 correspond to duplex 2 in which the AT base pairs at 
positions 2, 5 and 8 have been replaced with the GC and G 4Et C 
base pairs respectively), and on duplexes 2, 12, 12 and 14, with 
Tm values were determined at an oligomer strand concentration of 2 µM in
10⁻² M sodium cacodylate buffer, pH 7, containing 1 M NaCl and 2 × 10⁻⁴ M
EDTA.



different sequences. The results reported in Table 3 led us to the 
following observations. 

Duplexes 11, 12 and 14, involving three, six and nine G 4Et C
base pairs respectively, have Tm values (27°C, 24°C and
23°C) very close to that of duplex 2 (Tm 20°C), composed
of nine AT base pairs. These results show that the thermal stability of
these duplexes does not depend on their base composition,
contrary to what is observed with natural duplexes, whose Tm
values greatly increase with the number of GC base pairs (Figure 3).

As we have previously shown, G 4Et C base pair formation is
specific and, in fact, the T 4Et C mismatch (duplex 11a), C 4Et C
mismatch (duplex 11b) and A 4Et C mismatch (duplex 11c) led to
very unstable hybrids, making it impossible to determine their Tm
values (Tm <10°C).

The Tm of duplex 2 (Tm 20°C), involving nine AT base pairs,
is lower than that of the natural duplex having five GC base pairs,
three AT base pairs and one TC mismatch (Tm 28.5°C), CC
mismatch (Tm 21°C) or AC mismatch (Tm 27°C); under the
same conditions the Tm of duplex 2 is higher than that of modified
duplexes having five G 4Et C base pairs, three AT base pairs and
one T 4Et C mismatch (Tm <10°C), C 4Et C mismatch (Tm
<10°C) or A 4Et C mismatch (Tm <10°C). These results show
that under the hybridization conditions used it is not possible to
discriminate between perfect hybrids built with AT-rich sequences
and those built with GC-rich sequences involving a mismatch.
This major drawback of natural double-stranded DNA can be
alleviated by using duplexes built with AT and/or G 4Et C base pairs.

The thermal stabilities of modified duplexes 11, 12 and 14 are
slightly higher than that of duplex 2, having only AT base pairs.
This is in accordance with previously obtained results. We note
that the difference between the most stable duplex 11 (Tm 27°C)
and duplex 2 (Tm 20°C), possessing nine AT base pairs, is ~7°C.
On the other hand, under the same conditions the difference in
stability between natural duplexes involving nine GC (duplex 11,
Tm 59°C) and nine AT base pairs (duplex 2, Tm 20°C) is
~39°C.
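
The comparison above reduces to simple arithmetic on the quoted Tm values. The short sketch below is only an illustration: the dictionary labels are descriptive names chosen here (not the paper's duplex numbers), and the values are the Tm figures quoted in the text for 9 bp duplexes.

```python
# Tm values (°C) quoted in the text for 9 bp duplexes.
# The labels are descriptive names chosen for this sketch.
natural_tm = {"nine AT": 20.0, "nine GC": 59.0}
modified_tm = {
    "nine AT": 20.0,
    "three G-4EtC": 27.0,
    "six G-4EtC": 24.0,
    "nine G-4EtC": 23.0,
}


def tm_spread(tm_by_duplex):
    """Return the difference between the highest and lowest Tm in a set."""
    values = list(tm_by_duplex.values())
    return max(values) - min(values)


natural_spread = tm_spread(natural_tm)
modified_spread = tm_spread(modified_tm)

print(f"natural duplexes:  Tm spread = {natural_spread:.0f} °C")
print(f"modified duplexes: Tm spread = {modified_spread:.0f} °C")
```

A smaller spread means that a single hybridization temperature suits sequences of any base composition, which is the property the modified system is meant to provide.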

The Tm of duplex 14 (Tm 23°C), containing nine G 4Et C base
pairs with three 4Et C 4Et C doublets, was, as expected, similar to
that of duplexes containing nine AT base pairs (Tm 20°C), six AT
and three G 4Et C (Tm 27°C) or three AT and six G 4Et C (Tm
24°C) respectively, where the d 4Et C residues were not consecutive
(Table 3 and Fig. 3). Consequently, substitution at position 4 of
the C by an ethyl group does not lead to notable steric disturbance.
These results were confirmed by the high cooperativity of
dissociation of modified duplexes 12 and 14 into single strands,
which was similar to that of natural duplexes 2 and H (Fig. 4).

For duplexes 2, 12 and 14, involving nine AT base pairs, three
AT base pairs and six G 4Et C base pairs and nine G 4Et C base pairs
respectively, the increase in absorbance recorded at nm at the
time of dissociation into single strands decreased when the number
of G 4Et C base pairs increased. The same phenomenon was also
observed with natural duplexes 2 (three AT base pairs and six GC
base pairs) and 12 (nine GC base pairs). These results suggest that
the GC and G 4Et C base pairs have similar physicochemical
properties. It is highly likely that 4Et C forms a G 4Et C base pair
with G having three hydrogen bonds, since the G 4Et C base pair
stability would otherwise be weaker, as was shown for a GK base
pair (K being 2-pyrimidinone), which has two hydrogen bonds
(22) and is isomorphous with the GC base pair (Fig. 5).

Figure 5. Structure of the 'anti' and 'syn' conformations of 4Et C and of the GK
and G 4Et C base pairs.

In single-stranded structures the ethyl group in 4Et C has 'syn'
and 'anti' conformations, the balance of which can be quickly
shifted to the 'anti' conformation by hybridization with G. In fact,
variations in optical density as a function of temperature recorded
during duplex dissociation into single strands or during association
of the latter into duplexes are identical (results not shown).

Several hypotheses could be envisaged to explain the decrease in
G 4Et C base pair stability compared with that of the natural GC
base pair.

Substitution of one hydrogen of the amino group at position 4 
of C with one alkyl group (R = Me, Et, Pr, ...) induces a positive 
inductive effect higher than that of the hydrogen, which makes the 
other hydrogen less electrophilic leading to a weakening of the 
hydrogen bond between the H atom at position 4 of C and the 
oxygen atom at position 6 of G. 

The presence of a lipophilic alkyl group whose size is relatively 
large could modify the conformation and/or hydration of duplexes. 

All the results obtained show that the stability of modified 
duplexes built with AT and/or G 4Et C base pairs is not dependent 
on their AT/G 4Et C content ratio. This new system may be very 
useful for DNA sequencing by hybridization, allowing discrimination
between perfect hybrids and those involving mismatches.

CONCLUSION 

We have proven, for the first time, that the stability of duplexes 
made with oligonucleotides of a given length built with AT and/or 
G 4Et C base pairs is independent of their base content in a classical 
buffer solution. Specificity of the G 4Et C base pair is maintained 
and the cooperativity of modified duplex dissociation is similar 
to that of natural DNA duplexes involving natural base pairs. 
These very useful properties make possible the employment of 
this new system in reverse hybridization approaches, using a large 
number of sequences immobilized as a two-dimensional matrix 
for simple and fast analysis of nucleic acid sequences, or in 
biochemical techniques, such as random priming, for the preparation 
of DNA and labeling DNA fragments by an enzymatic method. 
Work is currently in progress to confirm the hybridization
properties of such modified duplexes with large oligonucleotide
sequences, to determine the influence of the sequence on thermal
stability and to prepare a DNA target fragment using d 4Et C
triphosphate and DNA polymerases.

ACKNOWLEDGMENT 

This work was supported by a BioMed contract (gene-CT 93-0009). 
REFERENCES 

1 Southern,E.M. (1989) International patent WO 89/10977.

2 Drmanac,R., Labat,I., Bruckner,I. and Crkvenjakov,R. (1989) Genomics, 4,
114-128.

3 Khrapko,K.R., Lysov,Y.P., Khorlin,A.A., Ivanov,I.B., Yershov,A.D.,
Vasilenko,S.K., Florentiev,V.L. and Mirzabekov,A.D. (1991) J. DNA
Sequencing Mapping, 1, 375-388.

4 Melchior,W.B.,Jr and Von Hippel,P.H. (1973) Proc. Natl. Acad. Sci. USA,
70, 298-302.

5 Wood,W.I., Gitschier,J., Lasky,L.A. and Lawn,R.M. (1985)
Proc. Natl. Acad. Sci. USA, 82, 1585-1588.

6 Jacobs,K.A., Rudersdorf,R., Neill,S.D., Dougherty,J.P., Brown,E.L. and
Fritsch,E.F. (1988) Nucleic Acids Res., 16, 4637-4650.

7 Maskos,U. and Southern,E.M. (1993) Nucleic Acids Res., 21, 4663-4669.

8 Riccelli,P.V. and Benight,A.S. (1993) Nucleic Acids Res., 21, 3785-3788.

9 Innis,M.A., Gelfand,D.H., Sninsky,J.J. and White,T.J. (Eds) (1990) PCR
Protocols: A Guide to Methods and Applications. Academic Press, New
York, NY.

10 Hoheisel,J.D. and Lehrach,H. (1990) FEBS Lett., 274, 103-106.

11 Hoheisel,J.D. (1996) Nucleic Acids Res., 24, 430-432.

12 Fritz,H.J., Frommer,W.B., Kramer,W. and Werr,W. (1982) In Gassen,H.G.
and Lang,A. (Eds), Chemical and Enzymatic Synthesis of Gene
Fragments. A Laboratory Manual. Verlag Chemie, Weinheim, Germany,
pp. 43-52.

13 Sung,W.L. (1981) Nucleic Acids Res., 9, 6139-6151.

14 Beaucage,S.L. and Caruthers,M.H. (1981) Tetrahedron Lett., 22,
1859-1862.

15 Fasman,G.D. (Ed.) (1975) Handbook of Biochemistry and Molecular
Biology. Nucleic Acids, 3rd Edn. Vol. 1. CRC Press, Cleveland, Ohio,
pp. 589.

16 Giannaris,P.A. and Damha,M.J. (1994) Can. J. Chem., 72, 909-918.

17 Butkus,V., Klimasauskas,S., Petrauskiene,L., Maneliene,Z., Janulaitis,A.,
Minchenkova,L.E. and Schyolkina,A.K. (1987) Nucleic Acids Res., 15,
8467-8478.

18 Williams,J.C., Case-Green,S.C., Mir,K.U. and Southern,E.M. (1994)
Nucleic Acids Res., 22, 1365-1367.

19 Mikita,T. and Beardsley,G.P. (1994) Biochemistry, 33, 9195-9208.

20 Li,S., Haces,A., Stupar,L., Gebeyehu,G. and Pless,R.C. (1993) Nucleic
Acids Res., 21, 2709-2714.

21 Ehrlich,M., Norris,K.F., Wang,R.Y.-H., Kuo,K.C. and Gehrke,C.W. (1986)
Biosci. Rep., 6, 387-393.

22 Gildea,B. and McLaughlin,L.W. (1989) Nucleic Acids Res., 17,
2261-2281.



J. Mol. Biol. (1996) 255, 589-603

An Approach to Random Mutagenesis of DNA Using
Mixtures of Triphosphate Derivatives of Nucleoside
Analogues

Manuela Zaccolo 1*, David M. Williams 2, Daniel M. Brown 2 and
Ermanno Gherardi 1

1 ICRF Cell Interactions Laboratory, MRC Centre,
Hills Road, Cambridge CB2 2QH, UK

2 MRC Laboratory of Molecular Biology, MRC Centre,
Hills Road, Cambridge CB2 2QH, UK

*Corresponding author



We describe a new method for random mutagenesis of DNA based on the
use of a mixture of triphosphates of nucleoside analogues. The method
relies on DNA amplification in vitro with Taq polymerase in the
presence of the 5'-triphosphates of
6-(2-deoxy-β-D-ribofuranosyl)-3,4-dihydro-8H-pyrimido-[4,5-c][1,2]oxazin-7-one
(dP) and of 8-oxo-2'-deoxyguanosine (8-oxodG). The newly synthesised
triphosphate derivative of dP (dPTP) is an excellent substrate for
Taq polymerase (Km = 22 µM versus Km = 9.5 µM for TTP); it is
incorporated in place of TTP and, with an approximately fourfold lower
efficiency, in place of dCTP. After 30 cycles of DNA
amplification, equimolar mixtures of the four normal dNTPs and dPTP
yield the following frequencies of the four transition mutations: A→G
(4.4 × 10⁻²), T→C (4.3 × 10⁻²), G→A (1.1 × 10⁻²) and C→T (1.0 × 10⁻²). The
triphosphate derivative of 8-oxodG (8-oxodGTP) is incorporated opposite
template adenine and yields two transversion mutations (A→C and T→G) at
frequencies of 0.8 × 10⁻² and 1.2 × 10⁻² respectively. Reaction mixtures
containing dPTP and 8-oxodGTP result in both dP- and 8-oxodG-induced
mutations and an extensive array of codon changes in the absence of
insertions and deletions. The method described differs from previous
mutagenesis procedures in three respects: (1) it enables very high
frequencies of base substitutions (up to 1.9 × 10⁻¹), (2) it allows control of
the mutational load via the number of DNA amplification cycles and (3) it
yields both transition and transversion mutations. The procedure may find
application in the generation of libraries of DNA and protein mutants from
which species with improved or novel activities may be selected.

© 1996 Academic Press Limited 

Keywords: pyrimidine analogue; dP; 8-oxodeoxyguanosine; PCR; random 
mutagenesis; protein engineering 



Introduction 

Protein engineering emerged in the early eighties
with the advent of site-specific mutagenesis, i.e.
with the ability to introduce specific amino acid
substitutions in protein sequences through mutagenic
oligonucleotides (Hutchinson et al., 1978;
Gillam & Smith, 1979a,b). Over the last 15 years, this
approach has yielded a wealth of data on the role
of individual residues in the structure and function
of a variety of proteins and has enabled the
production of enzymes (Winter et al., 1982; Estell
et al., 1985), antibodies (Neuberger et al., 1984) and
transporters (Looker et al., 1992) with improved or
novel properties. An extended application of this
approach has allowed the transfer of specific
functions from one protein to another, the recipient
protein being, typically, a homologue of the donor
protein. This has been exemplified successfully with
grafts of antigen-binding loops in antibodies (Jones
et al., 1986) and of allosteric functions in transporter
proteins such as haemoglobin (Komiyama et al.,
1995). In the latter example, 12 different residues
located in two different subunits (α1 and β2) had to
be substituted in order to transplant a bicarbonate
binding site into human haemoglobin. Clearly, an
obligatory requirement for success in these experiments
is high-resolution molecular structures of
the donor and/or recipient proteins or of close
homologues.

Abbreviations used: 8-oxodG, 8-oxo-2'-deoxyguanosine;
dP, 6-(2-deoxy-β-D-ribofuranosyl)-3,4-dihydro-
8H-pyrimido-[4,5-c][1,2]oxazin-7-one.

Figure 1. Base-pairing of P with A and G and 8-oxoG with A and C.

A different and complementary approach to
protein engineering involves the generation of 
random mutants followed by selection of the ones 
with improved or new phenotypes, an approach 
that mimics an important aspect of natural 
protein evolution. While it is unlikely that this 
approach can generate functions that require the 
concurrent replacement of a large number of 
amino acids (such as the bicarbonate effect in 
haemoglobin), it has nevertheless considerable 
potential for the development of proteins with 
improved properties. 

The power of this experimental approach is 
well illustrated by comparison with natural 
somatic hypermutation of antibody genes. Follow- 
ing antigen binding, antibody-producing cells 
undergo proliferation as well as targeted hyper- 
mutation of the variable (V) segments. The latter 
process, coupled with stringent selection for 
high-affinity variants, generates in a short period 
of time antibodies with binding affinities 100 to 
1000-fold higher than those encoded by their 
germ line counterparts (reviewed by Berek & 
Milstein, 1987). 

The successful adaptation of such a strategy to 
protein engineering in vitro is conceptually attrac- 
tive but it depends critically on the availability of 
efficient random mutagenesis procedures and the 
ability to clone large libraries of mutants and select 
the species with the desired phenotype. In the last 
few years considerable progress has been made 
towards the construction of large libraries, at least
in prokaryotes (Dower et al., 1988; Griffiths et al.,
1994) and the development of suitable selection
systems, notably through the advent of the phage 
display technology (Smith, 1985). There is no 
question, however, that there remains a require- 
ment for controlled and efficient random mutagene- 
sis procedures. 



We took the view that for random mutagenesis of
extended DNA sequences a PCR-based method
would be advantageous, utilising the triphosphates
of mutagenic nucleosides as substrates. The
requirements would be that the substrates should be
stable under the conditions of PCR cycling, be well
incorporated, be of high mutagenic efficiency and
between them direct both transition and transversion
point mutations. There are many base
analogue transition mutagens whose 5'-triphosphates
have been described. These include
N4-hydroxy-2'-deoxycytidine (Muller et al., 1978),
2-aminopurine-2'-deoxyriboside (Grossbergher &
Clough, 1981), 5-bromo-2'-deoxyuridine (Mott et al.,
1984), O6-methyl-2'-deoxyguanosine (Eadie et al.,
1984; Snow et al., 1984), N4-amino-2'-deoxycytidine
(Negishi et al., 1985), N4-methoxy-2'-deoxycytidine
(Singer et al., 1984; Reeves & Beattie, 1985),
N6-hydroxy-2'-deoxyadenosine (Abdul-Masih &
Bessman, 1986) and 5-hydroxy-2'-deoxycytidine
and -uridine (Purmal et al., 1994). However, none
of these has been shown to fulfil all the above
criteria. In order to be effective in inducing
transition mutations, the nucleoside analogue
requires a tautomeric constant (ratio of
imino/amino forms) close to unity.
N4-Amino-2'-deoxycytidine has a tautomeric constant between 0.1
and 10 (Brown et al., 1968) and its triphosphate is
well incorporated as an analogue of dCTP and
TTP by Escherichia coli DNA polymerase I (Negishi
et al., 1985). It is, however, relatively unstable.
N4-Hydroxy-2'-deoxycytidine and N4-methoxy-2'-deoxycytidine
both display tautomeric constants
between 10 and 30 (Brown et al., 1968; Morozov
et al., 1982) and thus are able to form Watson-Crick
base-pairs with both A and G. However, in the
imino forms of these analogues, the amino
substituent prefers to adopt the syn conformation
(with respect to N3) (Morozov et al., 1982; Shugar
et al., 1976), which shields the hydrogen bonding
face of the molecule and thereby reduces duplex
stabilities (Anand et al., 1987). For this reason,
although they are incorporated into DNA by
various DNA polymerases, they are poor substrates
(Singer et al., 1984; Reeves & Beattie, 1985). The
nucleoside dP (Figure 1) is an analogue of
N4-methoxy-2'-deoxycytidine, which, by virtue of
its bicyclic ring structure, is restricted to the anti
conformation. It thus forms stable base-pairs with
both A and G (Figure 1) comparable to T and C as
evidenced by the melting behaviour of oligonucleotides
containing them (Kong Thoo Lin & Brown,
1989). NMR data (Nedderman et al., 1993) and
crystallography (Moore et al., 1995) show that in
duplexes P forms a Watson-Crick pair with both A
and G; with the latter this is in rapid chemical
exchange with a wobble pairing in solution (Stone
et al., 1991; Nedderman et al., 1993). As it is widely
accepted that base-pair geometry is important in
substrate-template recognition by DNA polymerases,
we expected that the triphosphate of the P
deoxynucleoside (dPTP) would be a good substrate
for Taq polymerase.

In contrast, there are very few reports of
nucleoside triphosphate analogues capable of inducing
transversion mutations (Purmal et al., 1994; Pavlov
et al., 1994; Singer et al., 1986, 1989). Of these, only
the triphosphate of 8-oxo-2'-deoxyguanosine
(8-oxodGTP) appears to be a substrate for DNA
polymerases adequate for mutagenesis (Purmal
et al., 1994; Pavlov et al., 1994). It base-pairs with
both adenine and cytosine (Figure 1); the analogue
assumes the normal anti conformation when paired
with cytosine (Oda et al., 1991), but adopts the syn
conformation when forming a Hoogsteen pair with
adenine (Kouchakdjian et al., 1991) and therefore
elicits A→C transversions (Shibutani et al., 1991;
Maki & Sekiguchi, 1992; Cheng et al., 1992).

Here we report on an approach to random 
mutagenesis of DNA using dPTP and 8-oxodGTP. 
The two analogues are efficiently incorporated into 
DNA in vitro by the thermostable Taq polymerase 
and this allows for point mutations to be introduced 
with high efficiency into the replicated target 
sequence by PCR. We also show that their frequency 
can be determined by the number of amplification 
cycles. 

Results 

Synthesis of dPTP and 8-oxodGTP 

The triphosphate 8-oxodGTP was prepared
from dGTP using the literature procedure (Mo et al.,
1992). The triphosphate analogue dPTP was
prepared from the free nucleoside (Kong Thoo Lin 
& Brown, 1989) in 34% yield by phosphorylation 
with phosphoryl chloride, followed by reaction with 
pyrophosphate as described (Ludwig, 1981). Both 
triphosphates were purified by reverse phase 
HPLC; dPTP was initially purified by anion 
exchange chromatography. 

Figure 2. Incorporation into DNA and extension of
dPTP (A) and 8-oxodGTP (B) by Taq polymerase. As
template DNA the V region of the human antibody MH22
was used. A, The PCR reaction mixtures contained the
following nucleoside triphosphates: dATP, dGTP, dCTP,
TTP (1); dGTP, dCTP, TTP, dPTP (2); dATP, dCTP, TTP,
dPTP (3); dCTP, TTP, dPTP (4); dATP, dGTP, TTP, dPTP
(5); dATP, dGTP, dCTP, dPTP (6); dATP, dGTP, dPTP (7);
dATP, dGTP, dCTP, TTP, dPTP (8). All dNTPs were at
500 µM except in samples 4 and 7, in which dPTP was at
1 mM. B, The PCR reaction mixture included dATP, dCTP
and TTP at 500 µM. Samples 1 to 4 contained dGTP at
50 µM, 25 µM, 12.6 µM and 6.25 µM, respectively, and
8-oxodGTP at 500 µM. Samples 5 to 8 contained the same
concentrations of dGTP as samples 1 to 4 but no
8-oxodGTP. C, Amplification by PCR of different target
genes in the presence of the four natural dNTPs (lanes 1
to 4); equimolar concentrations of the four normal dNTPs
and dPTP (lanes 5 to 8); equimolar concentrations of the
four normal dNTPs, dPTP and 8-oxodGTP (lanes 9 to 12).
The template DNA was: human macrophage stimulating
protein (MSP; lanes 1, 5 and 9); human connexin 43 (lanes
2, 6 and 10); human connexin 31 (lanes 3, 7 and 11); the
C chain of human CD3 (lanes 4, 8 and 12). The size of the
fragments is indicated on the side.



Incorporation and extension of dPTP 
and 8-oxodGTP in template DNA by 
Taq polymerase 

For the purpose of DNA mutagenesis we used 
Taq polymerase because of the advantage that a 
thermostable enzyme has when performing mul- 
tiple cycles of DNA synthesis and because of its lack 
of proof-reading activity (Tindall & Kunkel, 1988). 

The ability of Taq polymerase to use dPTP as a 
substrate in DNA synthesis was assayed by 
amplifying a target gene in a PCR reaction mixture 
in which one or two of the normal dNTPs were 
replaced by the analogue. The results are shown in 
Figure 2A. In experiments in which a 350 bp DNA 
fragment was amplified, dPTP was able to replace 
TTP (lane 6) yielding amounts of PCR product 

Figure 3. A, Time course of DNA synthesis in the presence of 12.5 µM [32P]dCTP, dATP, dGTP and TTP (open
diamonds) or dPTP (filled squares). Primed M13mp18 was used as a template for DNA synthesis in the presence of
0.3 unit Taq polymerase. B and C show the rate of DNA synthesis during the first 80 seconds of the reaction in the
presence of 12.5 µM dATP, dGTP and [32P]dCTP and the indicated concentrations of dPTP (B) and TTP (C).



comparable to those obtained with the four normal 
dNTPs (lane 1). A much lower yield of DNA was 
obtained when dPTP was used in place of dCTP 
(lane 5). The analogue was not able to replace 
dATP (lane 2), dGTP (lane 3), both dATP and 
dGTP (lane 4), or both dCTP and TTP (lane 7). 
When we substituted a normal dNTP with dPTP 
the yield of full length product after PCR was 
influenced by the length and sequence of the DNA 
target (data not shown). This was never the case 
when dPTP was added at an equimolar concen- 
tration to the four normal dNTPs (see below). 

In contrast, none of the normal dNTPs could be
replaced by 8-oxodGTP in a PCR reaction even with
short template sequences (350 bp; data not shown).
However, incorporation of 8-oxodGTP and chain
extension was demonstrated by using 500 µM dATP,
TTP and dCTP and limiting concentrations of dGTP
in the presence of high concentrations of the base
analogue (Figure 2B). Lanes 5 to 8 show the PCR
products obtained in the presence of decreasing
concentrations of dGTP (50 µM, 25 µM, 12.6 µM and
6.25 µM respectively). The addition of 500 µM
8-oxodGTP to the same PCR reaction mixtures gave
an increased yield of DNA (compare lanes 2 to 4
with lanes 6 to 8).

In order to assess the general applicability of the 
method, we amplified by PCR DNA fragments of 
different sizes in the presence of dPTP or a 
combination of dPTP and 8-oxodGTP (Figure 2C). 
Using equimolar concentrations of dPTP and the 
four normal dNTPs efficient amplification of target 
DNA sequences from 0.3 to 2.2 kb in size could be 
achieved (lanes 5 to 8), which was comparable to 
that obtained in the absence of base analogues 
(lanes 1 to 4). When equimolar concentrations of 
dPTP, 8-oxodGTP and the four normal dNTPs 
were used, the yield of DNA decreased, especially 
in the case of longer target DNA fragments (lanes 
9 to 12). 



Kinetics of incorporation of dPTP by 
Taq polymerase 

In order to evaluate the performance of dPTP as
a substrate for Taq polymerase, its rate of
incorporation was analysed and compared with
TTP since initial experiments indicated that its
properties best resembled those of this natural
triphosphate. Figure 3A shows the rate of DNA
synthesis in the presence of dATP, dCTP and dGTP
plus TTP or dPTP. DNA synthesis was measured by
the incorporation of [α-32P]dCTP using a primed
M13 template at 72°C. Incorporation increased
linearly in the first 80 seconds when either dPTP or
TTP was present. In order to calculate rates of
incorporation for different concentrations of substrate,
time points were chosen over intervals in
which both triphosphate derivatives gave a linear
rate of synthesis (Figure 3B and C). Concentrations
lower than 50 µM had to be used for dPTP because
with higher concentrations the rate of DNA
synthesis did not increase linearly with time. For
TTP, concentrations between 1.25 and 25 µM were
used to obtain measurable differences in rates of
incorporation over time (Figure 3C). The apparent
Km values for TTP and dPTP were determined by
analysing the experimental data by the direct linear
plot method (Eisenthal & Cornish-Bowden, 1974).
The apparent Km for dPTP under these experimental
conditions was 22 µM, whilst that for TTP was
9.25 µM. The value for dPTP thus compares
favourably with those reported in the literature
(Kong et al., 1993) for the four natural dNTPs (14 to
17 µM).
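
For readers unfamiliar with the direct linear plot, the following is a minimal sketch of the estimator, not the authors' analysis. Each observation (s, v) defines a line Vmax = v + (v/s)·Km in parameter space; the estimates are the medians of the pairwise intersections. The concentrations and rates below are synthetic values generated from an assumed Km of 22 µM, purely for illustration.

```python
from itertools import combinations
from statistics import median


def direct_linear_plot(substrate, velocity):
    """Median-of-intersections estimate of (Km, Vmax).

    Each observation (s, v) gives the line Vmax = v + (v / s) * Km.
    Every pair of lines intersects at one (Km, Vmax) point; the medians
    of those points are taken as the best estimates.
    """
    km_estimates, vmax_estimates = [], []
    for (s1, v1), (s2, v2) in combinations(zip(substrate, velocity), 2):
        denominator = v1 / s1 - v2 / s2
        if denominator == 0:
            continue  # parallel lines carry no information
        km = (v2 - v1) / denominator
        km_estimates.append(km)
        vmax_estimates.append(v1 + (v1 / s1) * km)
    return median(km_estimates), median(vmax_estimates)


# Synthetic, noise-free data (hypothetical concentrations in µM).
TRUE_KM, TRUE_VMAX = 22.0, 1.0
concentrations = [5.0, 10.0, 20.0, 40.0]
rates = [TRUE_VMAX * s / (TRUE_KM + s) for s in concentrations]

km_est, vmax_est = direct_linear_plot(concentrations, rates)
print(f"Km ≈ {km_est:.2f} µM, Vmax ≈ {vmax_est:.2f}")
```

Using medians rather than means makes the estimate less sensitive to a single outlying measurement, which is the usual reason for choosing this method.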

In order to compare the relative efficiencies of 
insertion of dPTP opposite template adenine and 
guanine residues, respectively we adopted the 
procedure of Boosalis et al. (1987) for the 
determination of steady state kinetics using one of 
two primed synthetic oligonucleotide templates 

Primer:      5'-d(GGCCTTGATATTCACAAACGAAT)-3'
Template 1:  3'-d(CCGGAACTATAAGTGTTTGCTTACCATTCT)-5'  (dPTP/dTTP)
Template 2:  3'-d(CCGGAACTATAAGTGTTTGCTTACCGTTCT)-5'  (dPTP/dCTP)

Figure 4. Top: Plots of initial velocities versus [dNTP] for the incorporation by Taq polymerase of dPTP opposite A
(A), TTP opposite A (B), dPTP opposite G (C), dCTP opposite G (D). Below: Primer and templates used in experiments;
positions 1 to 3 are the three template positions following the 3' end of the primer.
v is as defined by Boosalis et al. (1987).



(Figure 4). The 32P-labelled primer in each case was
extended by the incorporation of dGTP at two
positions, followed by dPTP (templates 1 and 2), TTP
(template 1) or dCTP (template 2). Separation of
the products by PAGE followed by quantitation
of the radioactivity using a PhosphorImager
allowed the determination of the initial velocities
(Boosalis et al., 1987). Due to the very high extension
rate of Taq polymerase, the kinetic parameters were
determined at 55°C. The velocities for the insertion
of the particular triphosphate opposite template
Vmax and Km values (µM) for particular insertions
were determined from non-linear regression fitting
to the Michaelis-Menten equation.
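
A non-linear least-squares fit to the Michaelis-Menten equation, v = Vmax·[S]/(Km + [S]), can be sketched without any fitting library. The version below is an illustrative assumption rather than the procedure used in the paper: for a fixed Km the best Vmax has a closed form, so only Km is searched, here by golden-section search on a logarithmic scale. The data are synthetic and noise-free.

```python
import math


def fit_michaelis_menten(substrate, velocity, km_low=1e-3, km_high=1e3, iterations=200):
    """Least-squares fit of v = Vmax * s / (Km + s); returns (Km, Vmax)."""

    def best_vmax(km):
        # For fixed Km the model is linear in Vmax: v = Vmax * x, x = s / (Km + s).
        x = [s / (km + s) for s in substrate]
        return sum(v * xi for v, xi in zip(velocity, x)) / sum(xi * xi for xi in x)

    def squared_error(km):
        vmax = best_vmax(km)
        return sum((v - vmax * s / (km + s)) ** 2 for s, v in zip(substrate, velocity))

    # Golden-section search for the Km that minimises the squared error.
    ratio = (math.sqrt(5) - 1) / 2
    low, high = math.log(km_low), math.log(km_high)
    for _ in range(iterations):
        left = high - ratio * (high - low)
        right = low + ratio * (high - low)
        if squared_error(math.exp(left)) < squared_error(math.exp(right)):
            high = right
        else:
            low = left
    km = math.exp((low + high) / 2)
    return km, best_vmax(km)


# Synthetic data generated from assumed parameters (concentrations in µM).
ASSUMED_KM, ASSUMED_VMAX = 5.2, 0.86
s_values = [1.0, 2.5, 5.0, 10.0, 25.0, 50.0]
v_values = [ASSUMED_VMAX * s / (ASSUMED_KM + s) for s in s_values]

km_fit, vmax_fit = fit_michaelis_menten(s_values, v_values)
print(f"Km ≈ {km_fit:.2f} µM, Vmax ≈ {vmax_fit:.2f}")
```

With real, noisy measurements a dedicated regression routine that also reports standard errors would normally be preferred.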

Plots of v versus substrate concentration [S] are
illustrated for the four possibilities P·A, P·G, T·A and
C·G in Figure 4A to D and the kinetic parameters
and catalytic efficiencies (Vmax/Km) are given in
Table 1. The results indicate that dPTP is virtually
indistinguishable from TTP in terms of its
recognition by Taq polymerase. Furthermore, it is
incorporated approximately three times more
efficiently opposite template adenine than opposite
guanine residues.

Km values have been reported for 8-oxodGTP
with E. coli DNA polymerase I Klenow fragment
(Purmal et al., 1994) using a procedure analogous
to that described here. Values of 63 and 58 µM
for insertion opposite C and A, respectively, were
obtained at 37°C and compare with an average
value of approximately 1 µM for the normal dNTPs
(Purmal et al., 1994). In addition, the analogue is
a substrate for the thermostable Tth DNA polymerase
and has been shown to generate A→C
transversions at a rate of about 1% (Pavlov et al.,
1994).



Table 1. Kinetic parameters for dPTP with Taq polymerase

Template  Substrate  Vmax (rel)    Km (µM)       Vmax/Km       Relative efficiency
A         dPTP       0.86 ± 0.06   5.2 ± 1.5     16.5 × 10⁻²   0.99
G         dPTP       0.69 ± 0.08   12.1 ± 4.8    5.7 × 10⁻²    0.11
A         TTP        1.02 ± 0.11   6.1 ± 1.5     16.7 × 10⁻²   1.00
G         dCTP       1.01 ± 0.09   2.03 ± 0.68   49.8 × 10⁻²   1.00

The data are derived from Figure 4. For conditions refer to Materials and Methods.
Vmax (rel) is as described by Boosalis et al. (1987).
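
The efficiency columns of Table 1 can be rechecked from the Vmax and Km columns. The sketch below is illustrative only; it reads the Km for dPTP opposite G as 12.1 µM, which is an assumption chosen because it is the value consistent with the tabulated efficiency of 5.7 × 10⁻².

```python
# (template base, substrate, relative Vmax, Km in µM) as read from Table 1.
# The Km of 12.1 µM for dPTP opposite G is an assumed reading (see lead-in).
table_1 = [
    ("A", "dPTP", 0.86, 5.2),
    ("G", "dPTP", 0.69, 12.1),
    ("A", "TTP", 1.02, 6.1),
    ("G", "dCTP", 1.01, 2.03),
]

# Catalytic efficiency is Vmax / Km for each template/substrate pair.
efficiency = {(template, substrate): vmax / km for template, substrate, vmax, km in table_1}

# Relative efficiency: dPTP compared with the natural dNTP on the same template.
relative_efficiency = {
    "A": efficiency[("A", "dPTP")] / efficiency[("A", "TTP")],
    "G": efficiency[("G", "dPTP")] / efficiency[("G", "dCTP")],
}

# Preference of dPTP for insertion opposite A over insertion opposite G.
a_over_g = efficiency[("A", "dPTP")] / efficiency[("G", "dPTP")]

for template, value in relative_efficiency.items():
    print(f"dPTP opposite {template}: relative efficiency {value:.2f}")
print(f"dPTP opposite A versus G: {a_over_g:.1f}-fold")
```

The ratio of roughly three between insertion opposite A and opposite G is the figure quoted in the text.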

PCR mutagenesis with dPTP, 8-oxodGTP
and their mixtures

PCR mutagenesis was performed by amplifying a
target DNA fragment with Taq polymerase in the
presence of equimolar concentrations of the four
normal dNTPs, dPTP and/or 8-oxodGTP. The
product of this first PCR was subjected to a second
PCR in the presence of the four natural dNTPs in 
order to eliminate the base analogues from the 
target DNA before cloning and transforming the 
DNA into E. coli. In this way the pattern of mutation 
was not influenced by the DNA repair mechanisms 
of the host cell. 

Figure 5 shows the results of a series of 
mutagenesis experiments in which the following 
equimolar nucleotide mixtures were used: the four 
normal dNTPs together with either dPTP (Fig- 
ure 5A), or 8-oxodGTP (Figure 5B) or dPTP and 
8-oxodGTP (Figure 5C). Similar experiments were 
carried out on a second target gene with comparable 
results (data not shown). DNA amplification 
reactions were carried out for a variable number of 
cycles, as indicated in Figure 5 by the number 
preceding the point in the clone designation. The 
data show that a significant number of point 
mutations were generated in the target gene under 
the three experimental conditions tested although 
dPTP clearly proved to be a much more efficient 
mutagen than 8-oxodGTP, as expected. The data 
also clearly show that the number of mutations 
increased as a function of the number of cycles used 
for the DNA amplification reaction. When the 
frequency of mutations was plotted against the 
number of PCR cycles (Figure 6) a linear relation 
was apparent both in the case of 8-oxodGTP and for 
the mixture of dPTP and 8-oxodGTP at least up to 
30 cycles. In the case of dPTP the relation was linear 
for the first 20 cycles. For low numbers of cycles, the 
combination of the two triphosphate analogues 
appeared to produce a total number of mutations 
lower than that produced by dPTP alone, although 
the DNA produced in such reactions contained both 
dP-induced and 8-oxodG-induced mutations (see 
below). Although the clones sequenced after 
different numbers of PCR cycles were obtained 
from separate PCR reactions and mutations ap- 
peared to accumulate over the entire gene sequence, 
it is interesting to note that bases at particular 
positions were mutated more frequently than others 
(Figure 5). 
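
The linear relation between mutation frequency and cycle number suggests a simple way to plan the mutational load. The sketch below is a rough illustration under stated assumptions: the frequency is taken as mutations per base sequenced, it is scaled linearly with cycle number (valid only within the linear range reported above), and mutations are treated as independent between positions. The reference frequency of 2.0 × 10⁻² at 30 cycles and the 350 bp target length are illustrative inputs, not results from this section.

```python
def frequency_at_cycle(reference_frequency, reference_cycles, cycles):
    """Scale a per-base mutation frequency linearly with PCR cycle number."""
    return reference_frequency * cycles / reference_cycles


def expected_mutations(frequency, length_bp):
    """Expected number of point mutations in a fragment of the given length."""
    return frequency * length_bp


def fraction_unmutated(frequency, length_bp):
    """Fraction of clones carrying no mutation, assuming independent positions."""
    return (1.0 - frequency) ** length_bp


# Illustrative inputs (assumptions, see lead-in).
REFERENCE_FREQUENCY = 2.0e-2
REFERENCE_CYCLES = 30
TARGET_LENGTH_BP = 350

for cycles in (5, 10, 15, 30):
    freq = frequency_at_cycle(REFERENCE_FREQUENCY, REFERENCE_CYCLES, cycles)
    print(
        f"{cycles:2d} cycles: {expected_mutations(freq, TARGET_LENGTH_BP):.1f} "
        f"mutations per clone, {fraction_unmutated(freq, TARGET_LENGTH_BP):.1%} unmutated"
    )
```

Choosing the cycle number this way trades library diversity against the risk of accumulating too many substitutions per clone.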

The total number of bases sequenced in the 
cloned inserts and the mutations generated by 
dPTP, 8-oxodGTP and their combination are listed 
in Table 2. The pattern of mutations produced by 
dPTP, 8-oxodGTP and their combination is shown in 
Figure 7. Thus, of the mutations generated by dPTP, 
46.6% were A → G, 35.5% were T → C, 9.2% were
G → A and 8% were C → T. The major mutational
events (A → G and T → C transitions) resulted from
the preferential incorporation of dPTP opposite A in 
either strand and subsequent pairing of the 
incorporated dP with G. Incorporation of dPTP
opposite G in either strand and subsequent pairing
of dP with A (G → A and C → T transitions)
occurred, albeit at lower frequency (~1/4), as
expected from the results shown in Figure 2 and
Table 1. In addition to the four transitions mentioned
above, one T → G and two A → T transversions
were found out of 4093 bp sequenced.

In the mutants generated with 8-oxodGTP two 
types of transversion mutations were present: 
A → C (38.8%) and T → G (59%) (Figure 7). These
derive from the misincorporation of 8-oxodGTP
opposite A in either template strand (Shibutani
et al., 1991). One C → A transversion was found out
of 5463 bp sequenced. This mutation might be due
to incorporation of 8-oxodGTP opposite C in the
template followed by misincorporation of dATP
opposite template 8-oxodG during subsequent
replication. This mutagenic mechanism for
8-oxodGTP has been previously reported to occur when
8-oxodGTP completely substitutes for dGTP. However,
the frequency of C → A transversions was 40
times lower than that of the A → C transversions
(Cheng et al., 1992). A very small number of
additional mutations were also found: two A → G
transitions and one G → A transition.

From clones mutagenised with the combination 
of dPTP and 8-oxodGTP together, the pattern of 
mutations we observed is shown in Figure 7. All 
types of transition and transversion mutations 
which were expected from the combination of the 
two triphosphate analogues were observed 
although their respective frequencies were slightly 
different from those predicted based on the 
frequencies of dPTP and 8-oxodGTP mutations. The 
mixture of the two analogues also increased the
frequency of additional mutations (1 × 10⁻³). No
insertions and a single two-nucleotide deletion were 
found using either analogue over a total of 13,307 bp 
sequenced. 
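
The overall mutation frequencies implied by these counts can be recomputed with a short sketch. The numbers are the bases sequenced and total point mutations listed in Table 2; the dictionary layout and the function name are illustrative assumptions, and the frequencies are pooled over clones taken after different numbers of PCR cycles.

```python
# Bases sequenced and total point mutations, as listed in Table 2.
TABLE_2 = {
    "dPTP": (4093, 384),
    "8-oxodGTP": (5463, 91),
    "dPTP + 8-oxodGTP": (3751, 387),
}


def mutation_frequency(bases_sequenced, mutations):
    """Return the number of point mutations per base sequenced."""
    return mutations / bases_sequenced


# Total number of bases sequenced across the three conditions.
total_bases = sum(bases for bases, _ in TABLE_2.values())

# Pooled mutation frequency for each mutagenic dNTP condition.
frequencies = {
    name: mutation_frequency(bases, mutations)
    for name, (bases, mutations) in TABLE_2.items()
}

if __name__ == "__main__":
    print(f"total bases sequenced: {total_bases}")
    for name, freq in frequencies.items():
        print(f"{name}: {100 * freq:.2f}% of bases mutated")
```

The total reproduces the 13,307 bp quoted in the text, which is a quick consistency check on the table.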

Amino acid replacements induced by dPTP 
and 8-oxodGTP in target sequences 

We analysed the effect of the four transition 
mutations induced by dPTP and the two transver- 
sion mutations induced by the 8-oxodGTP at the 
codon level. Figure 8 shows the results of this 
analysis. The Figure groups amino acids into five 
classes: glycine, non-polar, polar, positively 
charged and negatively charged, and shows the 
codon changes resulting from dPTP mutagenesis 
(circles), 8-oxodGTP mutagenesis (squares) and 
their combination (triangles). Codon changes result- 
ing from a single base substitution are shown as full 
symbols, those resulting from a double substitution 
are shown as open symbols. 
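
A codon-level analysis of this kind can be sketched in a few lines. The sketch assumes the standard genetic code and uses only the substitutions the text attributes to each analogue (four transitions for dPTP, A → C and T → G for 8-oxodGTP); the function and variable names are illustrative and do not come from the original work, and the grouping of amino acids into five classes is not reproduced here.

```python
BASES = "TCAG"
# Standard genetic code in the conventional TCAG ordering ('*' = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# Substitutions attributed in the text to each analogue.
DPTP_TRANSITIONS = {"A": "G", "G": "A", "T": "C", "C": "T"}
OXODGTP_TRANSVERSIONS = {"A": "C", "T": "G"}


def single_base_changes(codon, substitutions):
    """List (mutant_codon, amino_acid, kind) for every one-base change."""
    original = CODON_TABLE[codon]
    results = []
    for pos, base in enumerate(codon):
        if base not in substitutions:
            continue
        mutant = codon[:pos] + substitutions[base] + codon[pos + 1:]
        new_aa = CODON_TABLE[mutant]
        if new_aa == original:
            kind = "silent"
        elif new_aa == "*":
            kind = "stop"
        else:
            kind = "replacement"
        results.append((mutant, new_aa, kind))
    return results


if __name__ == "__main__":
    for change in single_base_changes("AAA", DPTP_TRANSITIONS):
        print(change)
```

For the lysine codon AAA, for instance, the three possible A → G transitions give GAA (Glu), AGA (Arg) and the silent change AAG (Lys).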

In spite of the clear bias in the mutations induced 
by dPTP and 8-oxodGTP (Figure 7), the use of these 
analogues or their combination allowed extensive 
codon changes to be achieved. The two genes used 
as model templates contained 51 out of the possible 
64 codons (codons not present in either gene are 
marked with an asterisk in Figure 8). Of the 51 
Figure 5. Mutations in the VH segment of the antibody MH22 (Griffiths et al., 1994) and corresponding amino acid
substitutions produced by dPTP (A), 8-oxodGTP (B) and the mixture of the two (C) when used in equimolar amount
(500 µM) with the four normal dNTPs in a PCR reaction. The number before the dot in the clone designation represents
the number of PCR cycles allowed in the presence of the nucleoside triphosphate analogue(s). The second number
identifies the clones. The asterisk indicates a stop codon. Amino acid numbers, framework regions (FR) and
complementarity determining regions (CDR) are defined as described by Kabat et al. (1991). See Materials and Methods
for complete details of mutagenesis conditions.



codons present, 50 were mutated by dPTP or 
8-oxodGTP or by their combination. 

Of 224 codon changes which were found at least 
once in the database, 49 were silent, 66 changed the 
amino acid to another of the same class, 105 
changed the amino acid to one of a different class 
and four led to termination codons. These results 
thus demonstrate that a broad spectrum of amino 
acid substitutions can be generated by dPTP and/or 
8-oxodGTP mutagenesis. 

Discussion 

Comparison of the current procedure with 
existing methods for random mutagenesis 

We have described a procedure that enables rapid
and controlled point mutagenesis of DNA
sequences via mixtures of triphosphate derivatives of
base analogues and Taq polymerase. The procedure
results in a frequency of point mutations which, to
the best of our knowledge, is superior to that
obtained with any previous method while yielding
negligible numbers of nucleotide insertions and
deletions.

Two basic strategies have been pursued in the last 
decade in order to generate random mutations in 
DNA. In the first, polymerases lacking proof-reading
activity, such as AMV reverse transcriptase,
have been used to extend randomly terminated
3'-ends in the presence of a single dNTP
(Cunningham & Wells, 1987) or biased dNTP
mixtures (Lehtovaara et al., 1988). With the
availability of Taq polymerase, a thermostable DNA
polymerase lacking proof-reading activity (Tindall
& Kunkel, 1988), several protocols have been
developed aimed at harnessing the potential of this
enzyme for mutagenesis. The error rate of Taq
polymerase is of the order of 10⁻⁴ to 10⁻⁵ depending
on reaction conditions (Eckert & Kunkel, 1990) but
can be deliberately increased by altering the
concentration of MgCl2 and the ratio of the four
dNTPs or by adding MnCl2 (Leung et al., 1989;
Cadwell & Joyce, 1992). Although we found that



Random Mutagenesis with Nucleotide Analogues 








Figure 6. Frequency of mutation versus number of
cycles of PCR mutagenesis. Reaction conditions are as in
Figure 5. The data are derived from two sets of
experiments on two separate genes: the VH segment of the
MH22 antibody (Griffiths et al., 1994) and the Vκ segment
of the NQ2/16.2 antibody (Kaartinen et al., 1983).



these protocols yielded a biased pattern of
mutations dominated by A → G and T → C
transitions (E. Gherardi & C. Milstein, unpublished
results), other investigators have succeeded in 
employing these procedures for the generation and 
selection of protein variants (Hawkins et al., 1992; 
Gram et al., 1992). 

Over the last few years, however, another 
approach to the random mutagenesis of DNA has 
emerged, namely the incorporation into DNA of 
triphosphate derivatives of pyrimidine and purine 
base analogues with promiscuous base-pairing 
potential. This is an approach initially employed by
Weissmann and colleagues for the generation of
site-specific mutations in bacteriophage Qβ using
N4-hydroxy-2'-deoxycytidine triphosphate (Flavell
et al., 1974). The naturally-occurring 2'-deoxyinosine
has promiscuous base-pairing potential (Ohtsuka
et al., 1985) and its triphosphate derivative (dITP),
although not a good substrate for Taq polymerase
(Innis et al., 1988), has also been used in PCR for
mutagenesis aided by dNTP pool bias (Ikeda et al.,
1992; Spee et al., 1993). However, the frequencies of
mutation are low and the pattern is dominated by
transitions. In addition, it is evident that the method
is only suitable for mutagenising short segments of
DNA due to the poor extension of primers
terminating in inosine mismatches (Innis et al., 1988),
thereby attenuating amplification of mutagenised
fragments.

The analogue dPTP is an excellent substrate for 
Taq polymerase, exhibiting Km values comparable to
those of dCTP and TTP (Table 1). It displays
essentially identical kinetic parameters to those of
TTP whilst its incorporation opposite template G is
about 1/10th as efficient as dCTP incorporation
(Table 1). These results are in contrast to those
previously obtained during replication of dP within
a template using Taq polymerase (Kamiya et al.,
1994) where only A was incorporated opposite.
However, more recent results (F. Hill, D. Loakes & 
D.M.B., unpublished work) in which a template 
containing several dP residues was copied by this 
enzyme, show that both A and G are inserted in a 
ratio of about 2:1. 

We note that N4-amino-2'-deoxycytidine triphosphate
is incorporated by E. coli DNA polymerase I
half as efficiently as dCTP and one thirtieth as well
as TTP (Negishi et al., 1985); it or a close relative is
a promising candidate for experiments of the
present type, for example to adjust the balance of
transition mutations. Clearly the mutagenic pattern
of these analogues is directly related to their
respective tautomeric constants. In the case of dPTP
the mutational pattern which is observed results
from the preferential pairing of dP with A, which,
by comparison with the catalytic efficiency
(Vmax/Km) of insertion opposite G (Table 1), is
approximately three times as efficient.

8-OxodGTP is known to be incorporated by the 
Klenow fragment of E. coli DNA polymerase I with
an efficiency of between (4.4 to 7.2) x 10^ of that of 
the normal 5'-triphosphates analogues. Although 
the corresponding values for dPTP have not been 
determined with this enzyme, it is clear that dPTP 
is a superior substrate. Since both dPTP and 
8-oxodGTP are initially incorporated opposite 
template adenine residues, their combination pro- 
duces mutations predominantly originating from 
dPTP (Figure 7). 

Improvements in the current procedure 

Three further developments of the approach 
described herein are envisaged. The first is the 
replacement of dPTP with a closely related 
analogue which displays a tautomeric equilibrium 
that would adjust the balance between all four 
possible transition mutations. The second concerns 
the ratio of transition versus transversion mutations 



Table 2. Numbers and types of mutation produced using dPTP, 
8-oxodGTP and their mixture 

                                         Number of point mutations
Mutagenic dNTP        Bases sequenced    Total    Coding    Silent

dPTP                  4093               384      318       66
8-oxodGTP             5463               91       65        16
dPTP and 8-oxodGTP    3751               387      334       53



See Materials and Methods for details of templates used and mutagenesis 
conditions. 
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Figure 7. Pattern of mutations produced by dPTP, 8-oxodGTP and the mixture of the two. The data are obtained from 
the same experiment used to generate Figure 6 and are expressed as a percentage of the total number of mutations 
sequenced. 



in experiments in which both the dPTP (or related) 
analogue and 8-oxodGTP are used in combination. 
It should be possible to obtain comparable numbers 
of transitions and transversions from mutagenesis 
reactions in which the concentrations of dPTP (or 
related) and 8-oxodGTP are adjusted in order to 
compensate for their different kinetics. Finally it is 
clear that six transversion mutations (C → G,
T → G, T → A, A → T, A → C and G → C) either are
not produced by the dNTP mixture, or else they are
produced at very low frequencies (Figure 8). Other
analogues therefore, such as O2-ethylthymidine
triphosphate, which induces A → T transversions
albeit at a low frequency (Singer et al., 1989), could 
be used in order to extend the range. 

Random mutagenesis versus DNA shuffling in 
protein engineering 

While in vitro point mutagenesis followed by 
selection clearly aims to mimic an important aspect 
of protein evolution, it is clear that nature's strategy 
of protein engineering equally relies on a variety of 
other processes such as gene insertions, deletions, 
duplication and recombination. Procedures are 
being developed which aim to reproduce these 
events in vitro and harness their potential for 
protein engineering. In one such procedure, gene 
fragments obtained by DNase I treatment are 
reassembled by PCR in a process that promotes 
random recombination (Stemmer, 1994a). The 
effectiveness of this approach has been clearly 
illustrated by its application in the engineering of 
β-lactamase mutants, one of which, when expressed
in E. coli, increased the minimum
inhibitory concentration (MIC) of a β-lactam
antibiotic by 32,000-fold compared to wild-type
enzyme (Stemmer, 1994b). It is of interest, however, 
that the sequence of the improved mutant only 
contained five point mutations compared to the 
wild-type enzyme. This shows that Stemmer's 
protocol leads to an appreciable rate of point 
mutagenesis and that, at least in the β-lactamase
example, such point mutations are entirely responsible
for the maturation of the enzyme in the
absence of bona fide recombination. The results,
nevertheless, reinforce the concept that point 
mutagenesis is a powerful approach for protein 
engineering in vitro. 

Mutational load and protein engineering 

Previous DNA mutagenesis protocols typically 
resulted in relatively low mutational rates. The 
procedure described here, however, can lead to a 
frequency of nucleotide substitutions approaching 
one in five after 30 cycles of DNA amplification. 
This clearly raises the issue of an optimal 
mutational load for protein engineering. 

It seems reasonable to suggest that the lower limit 
of an efficient random mutagenesis protocol may 
aim at introducing, on average, one amino acid 
change per sequence but this may be sub-optimal. 
While a very large number of simultaneous
substitutions would clearly destroy protein stability,
studies with several model systems suggest that a
relatively small number of amino acids are critical
for function and stability.
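
The question of mutational load can be made concrete with a simple model. The sketch below assumes that each codon is replaced independently with the same probability, so that the number of amino acid changes per sequence follows a binomial distribution; this independence assumption, the function names and the example figures (a 100-codon sequence with a 1% chance of replacement per codon) are hypothetical and are not taken from the experiments described here.

```python
from math import comb


def substitution_distribution(n_sites, p_sub):
    """Binomial probabilities of observing k = 0..n_sites substitutions."""
    return [
        comb(n_sites, k) * p_sub**k * (1 - p_sub) ** (n_sites - k)
        for k in range(n_sites + 1)
    ]


def expected_substitutions(n_sites, p_sub):
    """Mean number of substitutions per sequence under the same model."""
    return n_sites * p_sub


if __name__ == "__main__":
    n_codons, p_replace = 100, 0.01  # hypothetical example values
    dist = substitution_distribution(n_codons, p_replace)
    print(f"mean changes per sequence: {expected_substitutions(n_codons, p_replace):.2f}")
    print(f"fraction of unmutated sequences: {dist[0]:.3f}")
    print(f"fraction with exactly one change: {dist[1]:.3f}")
```

Under this model, a library averaging one change per sequence still contains a substantial fraction of unmutated clones, which is one reason such a target may be sub-optimal.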

In T4 lysozyme, for example, substitution of each 
amino acid (except for the initiator methionine) with 
13 different amino acids has shown that more than 
half the positions tolerated all substitutions (Rennell 
et al., 1991). Furthermore, out of 2015 mutations, 
only 173 were seriously deleterious and these were 
confined to 53 out of 163 positions (Rennell et al., 
1991). Studies on the λ repressor also demonstrated
that numerous substitutions in the core of the 
protein are tolerated (Lim & Sauer, 1991). 

Although these studies do not address directly 
the effect of multiple random mutations, they 
suggest, nevertheless, that these would not invari- 
ably result in the loss of protein function, an 
argument reinforced by the results of studies on 
somatic hypermutation of antibody genes. The V 
segments of antibodies isolated in secondary or 
tertiary responses contain a considerable number of 
replacement mutations. However in cases in which 
the role of individual substitutions has been 
analysed, it appeared that only a few mutations 
played a role in affinity maturation (for example 3 
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Figure 8. Codon changes produced by dPTP (circles), 8-oxodGTP (squares) and the combination of the two (triangles). 
Filled symbols indicate a single nucleotide change within a codon, open symbols indicate two nucleotide changes within 
the same codon. Diamonds indicate the presence of nucleotide changes different from the ones expected (see the text 
for details). 0 Indicates the single case in which three nucleotide changes within a codon were found. Amino acids 
are grouped in five classes according to their physico-chemical characteristics: glycine, non-polar, polar, positively 
charged and negatively charged. Asterisks indicate codons which were not present in the two target genes studied. 



out of 19 amino acids in the anti-p-azophenylarsonate
antibody; Sharon, 1990). Yet, the frequency of
amino acid substitutions in the VH and VL domains 
of hypermutated antibodies approaches 1 in 10 (see 
Berek & Milstein, 1987, for a review). The procedure 
described here may allow the optimal mutational 
load for protein engineering to be addressed 
experimentally since this can be readily controlled 
(Figure 6) and libraries of protein mutants carrying 
different numbers of substitutions can be con- 
structed. 



Conclusions 

Finally, we point out the relationship of the 
present work to combinatorial oligonucleotide 
chemistry. In the latter a wide variety of short (n)
repertoires, generally of large sequence content (e.g.
4^n) can be synthesised. Essentially all possible
sequence isomers are formed and rounds of
selection have to be applied in a variety of formats
to identify the sequence of interest. In the present
approach a DNA sequence, functional ab initio, is
amplified under a variable mutational pressure and
the products then cloned. The mutational frequencies
that were observed, as related to PCR cycle
number, were derived from the few tens of
randomly picked colonies. Assuming that these
mutation frequencies hold over all clones carrying 
the insert, the number of mutant inserts sequenced 
represented a very small fraction of the total formed 
in each amplification process. It is reasonable 
therefore to propose that very large libraries of 
mutants can be constructed with the present 
method and it should be interesting to compare the 
mutants isolated from such libraries with those 
obtainable from large synthetic repertoires. 

Materials and Methods 

6-(2-Deoxy-β-D-ribofuranosyl)-3,4-dihydro-8H-pyrimido-
[4,5-c][1,2]oxazin-7-one 5'-triphosphate,
triethylammonium salt (dPTP)

The P deoxynucleoside (54 mg, 0.2 mmol; Kong Thoo Lin
& Brown, 1989) was dried in vacuo over P2O5 at 80°C
overnight, then suspended in dry trimethylphosphate
(0.5 ml) under argon. The flask was cooled in an ice-bath
whilst phosphoryl chloride (21 µl) was injected with
stirring. After stirring in the ice bath for 45 minutes, a
vortexed mixture of bis-tributylammonium pyrophosphate
(0.5 M in anhydrous DMF, 1.0 ml), tributylamine
(0.2 ml) and anhydrous DMF (0.4 ml) was added with
rapid stirring in ice, followed after ten minutes by
triethylammonium bicarbonate solution (pH 7.5, 0.1 M,
20 ml). After one hour, the sample was purified on a
Sephadex A25 column (25 mm x 330 mm) at 4°C eluting
with a linear gradient of triethylammonium bicarbonate
(0.05 to 0.8 M, 3 l). Fractions containing the 5'-triphosphate
(0.48 and 0.54 M buffer) were evaporated and
purified further by reverse phase HPLC (Waters C18,
7.8 mm x 300 mm) using a linear gradient over 20 minutes
of 0 to 4.5% acetonitrile in 0.1 M triethylammonium
bicarbonate (pH 7.5) at 2.5 ml/minute. Appropriate
fractions were evaporated and residual buffer removed
by coevaporation with methanol to afford the pure
triphosphate as the tetrakistriethylammonium salt (253
A260 units at pH 7, 0.067 mmol, 34%). δ (2H2O) ppm: -9.57
(d, γ-P), -10.34 (d, α-P), -22.02 (t, β-P). Approx. HPLC
retention time = 18.5 minutes.

8-Oxo-2'-deoxyguanosine 5'-triphosphate,
triethylammonium salt (8-oxodGTP)

This was prepared essentially as described (Mo et al.,
1985). dGTP (trisodium dihydrate, 58.48 mg, 96 µmol) in
100 mM sodium phosphate (8 ml) containing 30 mM
ascorbic acid and 100 mM hydrogen peroxide was
incubated at 37°C for four hours in the dark. The product
was purified by reverse phase HPLC (Waters C18,
19 mm x 300 mm) using a linear gradient over 30 minutes
of 0 to 15% acetonitrile in 0.1 M triethylammonium
bicarbonate (pH 7.5) at 7.5 ml/minute. The pure
triphosphate was obtained as the tetrakistriethylammonium
salt as described above. The absorbance spectrum
was identical to that described (Mo et al., 1985; Purmal
et al., 1994) (12.3 A Ui , 10.3 A m at pH 7, 5.2 µmol, 5.4%).
δ (2H2O) ppm: -9.68 (d, γ-P), -10.46 (d, α-P), -22.40 (t, β-P).
Approx. HPLC retention time = 27.9 minutes; dGTP
26.0 minutes.



Mutagenesis 

For mutagenesis experiments, 10 fmol template DNA
were amplified using 0.5 µl of AmpliTaq polymerase
(5 units/µl, Applied Biosystems) in a 20 µl reaction
containing the appropriate sense and antisense primers at
0.5 µM, 2 mM MgCl2, 10 mM Tris-HCl (pH 8.3), 50 mM
KCl, 1 g/l gelatine and dATP, dCTP, dGTP, TTP, dPTP
and/or 8-oxodGTP, each at 500 µM. After various
numbers of cycles (92°C for one minute, 55°C for 1.5
minutes, 72°C for 5 minutes) 1 µl of the amplified material
was used in a second PCR in which the above conditions
were used except that no dPTP or 8-oxodGTP was added
to the reaction mixture. The product of the second PCR
was restricted with BstEII and PstI and cloned into the
M13VHPCR1 vector (Jones et al., 1986). Sequence analysis
of single-stranded DNA prepared from phage isolates
was performed by the Sanger procedure (Sanger et al.,
1977) using Sequenase Version 2 (USB, Cleveland, OH)
according to the manufacturer's instructions.



Kinetic parameters 

Method A: Determination of Km values for dPTP and
TTP using primed M13 DNA

Km values were determined using primed single-stranded
(ss) M13 DNA as template. A mixture of 0.12 nM
ss M13mp18 template and 2.5 µM -40 reverse primer was
heated at 93°C for three minutes and annealed at 25°C for
20 minutes. 2.5 µl of this stock were diluted in 20 µl
reaction mixture containing 2 mM MgCl2, 10 mM
Tris-HCl (pH 8.3), 50 mM KCl, 1 g/l gelatine, dATP, dGTP,
dCTP at 12.5 µM each, 1 µl [α-32P]dCTP (3000 Ci/mmol,
10 µCi/µl) and different amounts of TTP or dPTP as
indicated. Reactions were initiated by the addition of 0.3
unit Taq polymerase to the prewarmed (72°C) reaction
mixture. Samples (10 µl) were withdrawn at different time
points and reactions stopped by addition of 8 µl of 20 nM
EDTA in 95% formamide. To remove unincorporated label
the reaction product was precipitated three times in
ethanol/sodium acetate and DNA synthesis was
measured as percentage of [32P]dCTP incorporation. Km
values were determined by the direct linear plot method
(Eisenthal & Cornish-Bowden, 1974).
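
The idea behind the direct linear plot can be sketched as follows. Each observation (s, v) defines the line Vmax = v + (v/s)·Km in (Km, Vmax) space, and the estimates are taken as the medians of the pairwise intersections of these lines. This is a minimal illustration rather than the procedure actually used for Table 1: the function name and the synthetic data are assumptions, and refinements of the method (for example the treatment of intersections with negative coordinates) are omitted.

```python
from itertools import combinations
from statistics import median


def direct_linear_plot(substrate, velocity):
    """Estimate (Km, Vmax) as medians of pairwise line intersections.

    Each observation (s, v) defines the line Vmax = v + (v / s) * Km.
    """
    km_estimates, vmax_estimates = [], []
    for (s1, v1), (s2, v2) in combinations(zip(substrate, velocity), 2):
        denom = v1 / s1 - v2 / s2
        if denom == 0:
            continue  # parallel lines have no intersection
        km = (v2 - v1) / denom
        km_estimates.append(km)
        vmax_estimates.append(v1 + (v1 / s1) * km)
    if not km_estimates:
        raise ValueError("need at least two observations with distinct v/s")
    return median(km_estimates), median(vmax_estimates)


if __name__ == "__main__":
    # Synthetic, noise-free Michaelis-Menten data (hypothetical values).
    true_km, true_vmax = 2.0, 10.0
    s_values = [0.5, 1.0, 2.0, 4.0, 8.0]
    v_values = [true_vmax * s / (true_km + s) for s in s_values]
    km, vmax = direct_linear_plot(s_values, v_values)
    print(f"Km = {km:.3f}, Vmax = {vmax:.3f}")
```

Because medians are used, a single aberrant measurement has less influence on the estimates than it would in a least-squares fit of a linearised plot.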



Method B: Determination of kinetic parameters for 
insertion of dPTP and TTP opposite template A, and 
dPTP and dCTP opposite template G 

Specific kinetic parameters for the insertion of dP 
opposite A or G, and T opposite A and C opposite G were 
determined essentially as described (Boosalis et al., 1987)
using a 23 base primer and one of two templates
(Figure 4). Both the primer and template oligonucleotides
were purified DMT-on by reverse phase HPLC (Waters
C18, 7.8 mm x 300 mm) with a linear gradient of 2.5 to
50% acetonitrile in 0.1 M triethylammonium bicarbonate
(pH 7.5) at 2.5 ml/minute. Appropriate fractions were
evaporated and the oligonucleotides detritylated with 
80% acetic acid for 30 minutes, then repurified by 
RP-HPLC as described above. Anion exchange chroma- 
tography (Hichrom, SAX 10) using an isocratic gradient of 
21 mM potassium phosphate (pH 6.3) in 60% formamide 
solution for two minutes then a linear gradient of 21 to 
300 mM in 33 minutes at 2.5 ml/minute followed by 
dialysis gave the pure oligonucleotides. 
Primer end labelling. The primer was 5'-end labelled
with 32P in a reaction buffer (30 µl) of 70 mM Tris-HCl
(pH 7.6), 10 mM MgCl2 and 5 mM DTT, containing
6.7 µM primer and 2.2 µM [γ-32P]ATP (3000 Ci/mmol),
and five units of T4 polynucleotide kinase. The solution
was incubated at 37°C for 30 minutes then cold ATP
(10 mM, 5 µl) added and after a further 10 minutes, the
reaction was terminated by heating at 65°C for 15
minutes. The labelled oligonucleotide was purified by
20% PAGE and eluted in 3 M sodium acetate solution
(pH 5.5). The primer was then ethanol precipitated,
redissolved in water and desalted on a Sephadex G-25
(NAP-10) column.

Primer-template annealing. The 32P-labelled primer
was annealed to the respective template in 500 µl
of 1x buffer (10 mM Tris-HCl, pH 8.3, 50 mM KCl)
containing 78.9 nM template and 86.8 nM primer at
95°C for five minutes, then cooled slowly to room
temperature.

Determination of kinetic parameters 

Taq DNA polymerase (Perkin Elmer Native Taq, five
units/µl) was diluted with buffer containing 20 mM
Tris-HCl (pH 8.0), 100 mM KCl, 0.1 mM EDTA, 1 mM
DTT and 50% glycerol to 0.2 unit/µl, then diluted to
0.0066 unit/µl in 1x buffer. Solution A contained diluted
Taq polymerase (30 µl) and annealed primer-template
solution (225 µl) in 1x buffer. Solution B contained
355 µM dGTP, 5.33 mM MgCl2 and dPTP, dCTP or TTP at
an appropriate concentration. Solution A (25.5 µl) containing
Taq polymerase and annealed primer-template and
solution B (25 µl) containing dNTPs/MgCl2 were
preincubated at 55°C for four minutes and reactions were
initiated by the addition of 10 µl of B to solution A. Final
reaction mixtures contained 1.5 mM MgCl2, 100 µM
dGTP, 50 nM annealed primer-template and
concentrations of dPTP of between 0.5 and 100 µM, and dCTP
and TTP between 0.5 and 30 µM. For dPTP
concentrations between 0.5 and 100 µM or dCTP and TTP
concentrations between 0.5 and 30 µM initial velocities
were determined up to ten minutes as described
(Boosalis et al., 1987) by terminating 8 µl aliquots of
reaction mixture with 12 µl of 95% formamide/10 mM
EDTA containing 0.5% (w/v) bromophenol blue. Samples
were denatured by heating at 95°C for three minutes,
then 10 µl were electrophoresed at 38 W for three hours
on a 16% (w/v) denaturing polyacrylamide gel and
analysed using a Molecular Dynamics PhosphorImager.
Bands were quantified using the GelTrak program (Smith
& Thomas, 1990). Initial velocities were constant with
these substrate concentrations for up to ten minutes,
therefore samples at t = four minutes (dPTP) or three
minutes (dCTP and TTP) using varying dNTP
concentrations were chosen for the determination of kinetic
parameters.
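
The final concentrations in the reaction mixture follow from simple dilution arithmetic, sketched below. The sketch assumes that solution A has a total volume of 255 µl (30 µl enzyme plus 225 µl annealed primer-template), that 10 µl of solution B is added to 25.5 µl of solution A, and that the stocks in solution B are 355 µM dGTP and 5.33 mM MgCl2; these unit assignments are the ones under which the stated final values of 100 µM dGTP, 1.5 mM MgCl2 and 50 nM primer-template are mutually consistent. Names are illustrative.

```python
def dilute(stock_conc, stock_volume, total_volume):
    """Concentration after diluting stock_volume of stock to total_volume."""
    return stock_conc * stock_volume / total_volume


VOL_A = 25.5            # µl of solution A in each reaction
VOL_B = 10.0            # µl of solution B added to start the reaction
TOTAL = VOL_A + VOL_B   # 35.5 µl final volume

# Solution A: 225 µl of 78.9 nM annealed template in an assumed 255 µl.
template_in_a = dilute(78.9, 225.0, 255.0)             # nM
template_final = dilute(template_in_a, VOL_A, TOTAL)   # nM

# Solution B stocks (units as assumed in the lead-in).
dgtp_final = dilute(355.0, VOL_B, TOTAL)    # µM
mgcl2_final = dilute(5.33, VOL_B, TOTAL)    # mM

if __name__ == "__main__":
    print(f"primer-template: {template_final:.1f} nM")
    print(f"dGTP: {dgtp_final:.1f} µM")
    print(f"MgCl2: {mgcl2_final:.2f} mM")
```

The same helper gives the stock concentration of dPTP, dCTP or TTP needed in solution B for a desired final concentration, by multiplying the target by 35.5/10.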
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ABSTRACT 

The synthesis and enzymatic incorporation into RNA of 
the hydrogen bond degenerate nucleoside analogue 
6-(β-D-ribofuranosyl)-3,4-dihydro-8H-pyrimido[4,5-c]-
[1,2]oxazin-7-one (P) is described. The 5'-triphosphate
of this analogue is readily incorporated by T3, T7 and 
SP6 RNA polymerases into RNA transcripts, being 
best incorporated in place of UTP, but also in place of 
CTP. When all the uridine residues in an HIV-1 TAR RNA 
transcript are replaced by P the transcript has 
similar characteristics to the wild-type TAR RNA, as 
demonstrated by similar melting temperatures and CD 
spectra. The P-substituted TAR transcript binds to the 
Tat peptide ADP-1 with only 4-fold lowered efficiency 
compared with wild-type TAR. 

INTRODUCTION 

Nucleoside bases differing from the normal purines, adenine and
guanine, and the pyrimidines, thymine (uracil) and cytidine, are
uncommon in DNA but relatively abundant in RNA, and in
particular in transfer RNAs. The vast majority are the result of
post-transcriptional modifications of the nucleic acids by specific
enzymes. Although a specific role can seldom be assigned to these
modifications, their importance can be inferred from their
remarkable phylogenetic conservation. Analogues of the natural
bases which could be incorporated enzymatically in vitro could
provide useful functional alterations to synthetic RNA transcripts.
Early mutagenesis studies have demonstrated that N4-hydroxycytidine
triphosphate (1, Fig. 1) is efficiently incorporated by the
DNA-dependent RNA polymerase from Micrococcus luteus (1);
the phage Qβ, T2 and Escherichia coli Pol II polymerases also
incorporate it into RNA or DNA (2-4). The addition of the
electronegative element to the N4-amino group alters the
tautomeric ratio of the base; the alternative tautomers can
base-pair with either adenine or guanine. The tautomeric constant
(KT) for 1-methyl-N4-hydroxycytosine has been measured giving
a ratio of 10:1 in favour of the oximino-form in water (5), and thus
correlates qualitatively with the analogue being recognised as
either of the natural pyrimidines.

The N4-hydroxyl group can adopt either a syn or an anti
conformation; the preferred syn form (6), however, interferes
with hydrogen bonding of Watson-Crick base-pairs, and this is
more evident in the case of N4-methoxy-derivatives (6). To
constrain the hydroxyl group in an anti conformation we have
previously synthesised a ribonucleoside containing a 5-membered
second ring (2, Fig. 1) (7), but this compound proved too unstable
to allow conversion either to its phosphoramidite monomer or its
5'-triphosphate. Strain in the 5-membered ring resulted in its
cleavage during further reactions. The 2'-deoxynucleoside (dP)
containing a 6-membered second ring proved to be stable. The
properties of dP and its 5'-triphosphate have been intensively
studied (8-11). The ribonucleoside 6-(β-D-ribofuranosyl)-3,4-
dihydro-8H-pyrimido[4,5-c][1,2]oxazin-7-one (rP) (3) has now
been synthesised, by a route modified from that described for the
deoxynucleoside, and converted to its 5'-triphosphate. Incorporation
of the triphosphate by three widely used RNA polymerases, those
of the bacteriophages T3, T7 and SP6, has been examined. The
properties of HIV-1 TAR RNA transcripts synthesised by T3
RNA polymerase using rPTP in place of UTP or CTP have also
been investigated.

MATERIALS AND METHODS 

General methods 

1H NMR spectra were obtained on Bruker WM-250 and DRX
300, and 31P NMR spectra on a Bruker WM-250 spectrometer.
NMR spectra were obtained in d6-DMSO. 31P NMR spectra are
referenced to phosphoric acid. Mass spectra were recorded on a
Hewlett-Packard G205A Maldi-TOF spectrometer with positive
polarity, in a matrix of α-cyano-4-hydroxycinnamic acid in
MeCN:H2O (1:1) with 3% trifluoroacetic acid. UV spectra were
recorded on a Perkin Elmer Lambda 2 spectrophotometer fitted
with a Peltier cell and samples were dissolved in 1% aqueous
methanol. TLC was carried out on pre-coated F254 silica plates
and column chromatography with Merck kieselgel 60. Unless
otherwise stated reactions were worked up as follows: after
removal of the solvent, the product was dissolved in chloroform
and washed with aqueous sodium bicarbonate solution. The
combined organic fractions were dried over sodium sulphate and
evaporated.

Synthesis 

5-Allyluridine was prepared according to literature procedures
(12). This was then converted to the 2',3',5'-tris-(tert-butyldimethylsilyl)
derivative (4) as previously described (13).

(5). To a solution of 2',3',5'-tri-(tert-butyldimethylsilyl)-5-allyluridine
(4) (9.5 g, 15 mmol) in acetone (250 ml) was added
N-methylmorpholine-N-oxide (5.4 g, 46 mmol) followed by
potassium osmate dihydrate (70 mg) in water (10 ml) over 5 min.
The solution was then stirred at room temperature overnight, and
the solvent was removed. The product was worked up as
described and then chromatographed (CHCl3/5% MeOH) to give
a white foam. Yield 9.84 g, 98%. 1H NMR δ (p.p.m.) -0.11-0.09
(loH.m.6xSiCH^.0.79-0.X9(27H.m,3xC(CH,),) ^0-3 i| 
(2H. m. C7Y;CH). }.52-X55 (3H, m. O/C/hOHl T 68-1 77 
<-«, m. H5'. H5"). 3.91-3.92 ( I H. m. H2'). 4*01 <IH t HV) 



4. 19-1.2 1 ( I H. m. H4'), 4.49 ( I H. t, OH). 4.56 ( I H, d. OH) 5 SS 
( I H. t. J = 7.4 Hz. H 1 '), 7.36 ( 1 H, d. J = 5.3 Hz, H6). 1 1 .40 ( I H 
s. NH). UV ^ 268, 208; 230. M/z 684.306 (M+Na) + ! 
700.809 (M+K) + . 

?J'S-Trh(rea-btinldimeM (7) 
The above diol (5) (8.3 g, 12.6 mmol) was dissolved in dioxane
(125 ml, 10 ml/mmol) and water (25 ml, 2 ml/mmol) added. To
this was then added a solution of sodium periodate (8.0 g,
37 mmol) in water (25 ml, 2 ml/mmol) and the solution stirred at
room temperature for 2.5 h. The solution was concentrated then
worked up as usual to give the crude aldehyde, 6. This was
dissolved in THF (200 ml) and sodium borohydride (0.5 g,
13 mmol) added followed by water (1 ml) and the solution stirred
at room temperature for 1 h. The reaction was quenched with
acetic acid, the solution evaporated and the product worked up as
usual and chromatographed (CHCl3/3% MeOH) to give an
off-white foam. Yield 4.42 g, 56%, remainder (3.06 g) unreacted
diol 5. 'H NMR 5 (p.p.m.) -0.1 0-0. 10 (18H, m. 6x SiCHO 
0.79^).90 (27H, m, 3x C(CH,h>, 2.33-2.37 (2H, m.' 
CH 2 CH z OHi 3.36-3.43 (2H, m, CHiC^OH), 3.69-3.79 (2H 
m, H5', H5"), 3.91 (IH.br s, H2'). 4.02^.04 (IH. m. H3'). 
4. 1 8-4.22 ( 1 H. m. H4'). 4.6 1 ( I H, t, OH), 5.86 ( I H. d. J = 7. 1 H/ 
H 1 '). 7.40 ( 1 H. s, H6). 1 1 .40 ( 1 H, s. NH). U V 267, 209; X nVm 
230. M/z 654.487 (M+Na)+ 671.172 (M+K) + . 
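
The solvent volumes in the procedure above are quoted per millimole of substrate (10 ml/mmol dioxane, 2 ml/mmol water), which makes the preparation easy to rescale. The following is only a minimal sketch of that arithmetic; the function names are illustrative and are not part of the original procedure.

```python
def solvent_volume_ml(substrate_mmol, ml_per_mmol):
    """Solvent volume (ml) for a given amount of substrate and a per-mmol ratio."""
    if substrate_mmol <= 0 or ml_per_mmol <= 0:
        raise ValueError("amount and ratio must be positive")
    return substrate_mmol * ml_per_mmol


def equivalents(reagent_mmol, substrate_mmol):
    """Molar equivalents of a reagent relative to the substrate."""
    if substrate_mmol <= 0:
        raise ValueError("substrate amount must be positive")
    return reagent_mmol / substrate_mmol


# Values quoted in the text for the periodate cleavage of diol 5 (12.6 mmol).
dioxane_ml = solvent_volume_ml(12.6, 10)   # about 126 ml; the text rounds to 125 ml
water_ml = solvent_volume_ml(12.6, 2)      # about 25 ml
periodate_eq = equivalents(37, 12.6)       # roughly 2.9 equivalents of NaIO4
```

With these figures the periodate is used in roughly three-fold molar excess over the diol.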

2',3',5'-Tri-(tert-butyldimethylsilyl)-5-(2-phthalimidooxyethyl)- 
uridine (8). To a solution of the alcohol (7) (3.1 g, 5 mmol) in THF 
(50 ml) was added triphenylphosphine (2.6 g, 10 mmol), 
N-hydroxyphthalimide (1.6 g, 10 mmol) and then diisopropylazodi- 
carboxylate (DIAD) (2 g, 10 mmol) and the solution stirred at room 
temperature overnight. The solution was then evaporated, worked 
up as described and then chromatographed (twice, CHCl3/1% 
MeOH) to give a pale yellow foam. Yield 3.65 g, 96%. 1H NMR 
δ (p.p.m.) -0.11-0.10 (18H, m, 6x SiCH3), 0.58-0.91 (27H, m, 
3x C(CH3)3), 2.61-2.71 (2H, m, CH2CH2ON), 3.60-3.69 (2H, 
m, CH2CH2ON), 3.72-3.90 (3H, m, H2', H5', H5"), 4.04 (1H, br 
s, H3'), 4.14-4.27 (1H, m, H4'), 5.87 (1H, d, J = 7 Hz, H1'), 7.56 
(1H, s, H6), 7.5[illegible]-7.64 (4H, m, Ph), 11.54 (1H, s, NH). UV λmax 
26[illegible], 220; λmin 245. M/z 798.201 (M+Na)+, 814.482 (M+K)+. 

1-(2',3',5'-Tri-(tert-butyldimethylsilyl)-[...]-5-(2-phthalimidooxyethyl)-[...] (9). To a 
solution of 1,2,4-triazole (4.8 g, 69.5 mmol) in dry acetonitrile 
(75 ml) at 0°C was added phosphorus oxychloride (1.3 ml, 
14 mmol) and the solution stirred at 0°C for 15 min. To this was 
then added triethylamine (11.6 ml, 83 mmol) and the solution 
stirred for a further 15 min at 0°C. To the solution was then added 
a solution of 2',3',5'-tri-(tert-butyldimethylsilyl)-5-(2-phthalimido- 
oxyethyl)-uridine (8) (3.6 g, 4.6 mmol) in acetonitrile (25 ml) and 
the solution stirred at room temperature overnight (product has 
same Rf as starting material). The solution was evaporated and 
worked up as described and then chromatographed (CHCl3/1% 
MeOH) to give an off-white foam. Yield 2.44 g, 64%. 1H NMR 
δ (p.p.m.) -0.11-0.12 (18H, m, 6x SiCH3), 0.83-0.91 (27H, m, 
3x C(CH3)3), 3.23-3.30 (2H, m, CH2CH2ON), 3.79-3.83 (2H, 
m, CH2CH2ON), 3.93-4.02 (1H, m, H2'), 4.03-4.08 (1H, br s, 
H3'), 4.21-4.38 (2H, m, H5', H5"), 4.38-4.40 (1H, m, H4'), 5.87 
(1H, d, J = 4 Hz, H1'), 7.51-7.64 (5H, m, H6, Ph), 8.19 (1H, s, 
triazole CH), 9.36 (1H, s, triazole CH). UV λmax (nm) (10% 
MeOH/H2O) 331, 264. pH 12 λmax 267. M/z 828.696 (M+H)+, 
849.629 (M+Na)+, 865.969 (M+K)+. 

6-(2',3',5'-Tri-(tert-butyldimethylsilyl)-β-D-ribofuranosyl)-3,4-di- 
hydro-8H-pyrimido[4,5-c][1,2]oxazin-7-one (10). The above 
triazole (9) (2.0 g, 2.4 mmol) was dissolved in dioxane saturated 
with ammonia (25 ml) and the solution stirred at room temperature 
overnight. The solution was evaporated and the product chromato- 
graphed (CHCl3/2% MeOH) to give an off-white foam. Yield 
0.88 g, 58%. 1H NMR δ (p.p.m.) -0.06-0.09 (18H, m, 6x SiCH3), 
0.81-0.89 (27H, m, 3x C(CH3)3), 3.31-3.34 (2H, m, 
CH2CH2ON), 3.66-3.74 (2H, m, H5', H5"), 3.77-3.87 (3H, m, 
H2', CH2CH2ON), 3.99-4.01 (1H, m, H3'), 4.09-4.13 (1H, m, 
H4'), 5.83 (1H, d, J = 7.5 Hz, H1'), 6.79 (1H, s, H6), 10.63 (1H, 
s, NH). UV λmax (nm) (MeOH) 298 (ε = 7400); λmin (nm) 262. 
pH 1 λmax 304 (ε = 12 400). pH 12 λmax 303 (ε = 7700). M/z 
651.697 (M+Na)+. 

6-(β-D-Ribofuranosyl)-3,4-dihydro-8H-pyrimido[4,5-c][1,2]oxa- 
zin-7-one (3). The above product (10, 0.85 g, 1.35 mmol) was 
dissolved in methanol (25 ml) and ammonium fluoride (0.3 g, 
8.1 mmol) added and then the solution heated at 50°C overnight. 
The solvent was removed and the product chromatographed 
(CHCl3/20% MeOH) to give a white solid. Yield 0.31 g, 80%. 1H 
NMR δ (p.p.m.) 3.15 (2H, d, J = 5 Hz, CH2CH2ON), 3.45-3.58 
(2H, m, H5', H5"), 3.74-3.76 (1H, m, H2'), 3.82 (2H, t, J = 5 Hz, 
CH2CH2ON), 3.91-3.97 (2H, m, H3', H4'), 4.98-5.01 (2H, m, 
OH), 5.22 (1H, d, OH), 5.72 (1H, d, J = 5.9 Hz, H1'), 7.00 (1H, 
s, H6), 10.50 (1H, s, NH). UV λmax (nm) (H2O) 295 (ε = 5100), 
231 (ε = 6100), λmin 260. pH 1 λmax 301 (ε = 7800). pH 12 λmax 
302 (ε = 5000). ε260 (µM) 2.5. [cf. dP: λmax 295 (ε = 6300), 231 
(ε = 7000). pH 1 λmax 301 (ε = 10 000), 224 (ε = 4555). pH 12 
λmax 302 (ε = 6400)]. M/z 286.727 (M+H)+, 308.711 (M+Na)+. 

6-(β-D-Ribofuranosyl)-3,4-dihydro-8H-pyrimido[4,5-c][1,2]oxa- 
zin-7-one-5'-triphosphate (rPTP). To a solution of the nucleoside 
3 (80 mg, 0.28 mmol) in trimethyl phosphate (0.5 ml) and triethyl 
phosphate (0.5 ml) under nitrogen at 0°C was added phosphoryl 
chloride (36.7 µl, 0.4 mmol), and the reaction stirred at 0°C for 3 h. 
To the solution was then added tributylammonium pyrophosphate 
(3 ml, 0.5 M in DMF) and tributylamine (0.4 ml) and the solution 
stirred at 0°C for 45 min, when the reaction was complete. The 
reaction was quenched with TEAB (20 ml, 1 M, pH 8.5) and then 
evaporated to dryness. The product was dissolved in 15 ml water 
and purified by anion exchange HPLC (buffer A, water; buffer B, 
0.3 M TEAB pH 8.5, gradient 0-100% B over 80 min), and then 
further purified by reverse phase HPLC (RP C-18, buffer A, 
0.1 M TEAB; buffer B, 0.1 M TEAB, 50% acetonitrile, gradient 
0-100% B over 40 min). Yield 75.3 µmol (28%) as 25 mM 
solution in water. 31P NMR δ (p.p.m.) (D2O/EDTA) -10.71 
(d, γ-P), -11.54 (d, α-P), -23.22 (t, β-P). M/z 526.429 (M+H)+. 
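
The two HPLC purifications above use linear 0-100% B gradients. As a rough illustration only, the sketch below computes the nominal eluent composition at a given time, assuming an ideal linear gradient with no dwell volume or delay; the function names are hypothetical and the defaults are the buffer values quoted in the text.

```python
def fraction_b(t_min, duration_min):
    """Fraction of buffer B (0..1) at time t for a linear 0-100% B gradient."""
    if duration_min <= 0:
        raise ValueError("gradient duration must be positive")
    return min(max(t_min / duration_min, 0.0), 1.0)


def anion_exchange_teab_molar(t_min, duration_min=80.0, teab_b_molar=0.3):
    """Nominal TEAB concentration (M): buffer A is water, buffer B is 0.3 M TEAB."""
    return teab_b_molar * fraction_b(t_min, duration_min)


def reverse_phase_acetonitrile_percent(t_min, duration_min=40.0, mecn_b_percent=50.0):
    """Nominal % acetonitrile: both buffers are 0.1 M TEAB, B also holds 50% MeCN."""
    return mecn_b_percent * fraction_b(t_min, duration_min)


# Halfway through each gradient:
teab_mid = anion_exchange_teab_molar(40.0)          # about 0.15 M TEAB
mecn_mid = reverse_phase_acetonitrile_percent(20.0)  # about 25% acetonitrile
```

Real retention times also depend on the column and instrument, so these values only indicate the nominal composition delivered by the pump.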



Melting experiments 

Melting transitions were measured at 260 nm in 100 mM sodium 
phosphate (pH 7) at an oligomer strand concentration of ~1.8 µM. 
Absorbance versus temperature for each duplex was obtained at 
a heating and cooling rate of 0.5°C/min, and the melting 
transitions (Tm) determined as the maxima of the first differential 
curves with an error of ±1°C. Thermodynamic calculations were 
carried out as described by Gralla and Crothers (14). 
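
The Tm determination described above (maximum of the first differential of absorbance against temperature) can be sketched as follows. This is an illustrative reconstruction rather than the authors' software: it uses plain central differences on unsmoothed data, and the synthetic sigmoidal curve is invented purely to exercise the function.

```python
import math


def melting_temperature(temps, absorbances):
    """Return the temperature at which dA/dT (central difference) is largest."""
    if len(temps) != len(absorbances) or len(temps) < 3:
        raise ValueError("need at least three paired (temperature, absorbance) points")
    best_slope, best_temp = float("-inf"), None
    # Central differences are defined only at interior points.
    for i in range(1, len(temps) - 1):
        slope = (absorbances[i + 1] - absorbances[i - 1]) / (temps[i + 1] - temps[i - 1])
        if slope > best_slope:
            best_slope, best_temp = slope, temps[i]
    return best_temp


# Synthetic melting curve (assumed shape): a sigmoid centred on 60 degrees C.
temps = list(range(20, 91))
absorbances = [0.5 + 0.1 / (1.0 + math.exp(-(t - 60) / 3.0)) for t in temps]
tm = melting_temperature(temps, absorbances)
```

Experimental traces are noisy, so in practice some smoothing before differentiation would normally be needed; the ±1°C error quoted in the text is the authors' figure, not a property of this sketch.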



Circular dichroism (CD) measurements 

CD measurements were carried out on a Jobin-Yvon Dichrograph 
CD6 spectrometer. Data were collected at 0.25 nm intervals, and 
measurements were carried out between 190 and 330 nm at 20°C. 
Five such runs were averaged, calculated net of buffer and factor 
3 smoothed. Samples were prepared with an oligonucleotide 
concentration of A260 = 0.5 in 10 mM sodium phosphate (pH 7) 
buffer (15), with a path length of 1 mm. 
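
The data handling described above (averaging five scans, subtracting the buffer and smoothing) might look like the sketch below. The text does not define "factor 3" smoothing, so a three-point moving average is assumed here purely for illustration; the function name and the toy data are hypothetical.

```python
def process_cd(runs, buffer_scan, window=3):
    """Average repeated CD scans, subtract the buffer scan, then smooth.

    `window` is the width of a centred moving average (assumed meaning of
    'factor 3'); the window is truncated at both ends of the spectrum.
    """
    n = len(buffer_scan)
    if not runs or any(len(run) != n for run in runs):
        raise ValueError("all scans must have the same number of points")
    mean = [sum(run[i] for run in runs) / len(runs) for i in range(n)]
    net = [m - b for m, b in zip(mean, buffer_scan)]
    half = window // 2
    smoothed = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        smoothed.append(sum(net[lo:hi]) / (hi - lo))
    return smoothed


# Wavelength grid implied by the text: 190-330 nm in 0.25 nm steps (561 points).
wavelengths = [190 + 0.25 * i for i in range(561)]

# Toy example with two three-point scans and a flat buffer baseline.
example = process_cd([[1, 2, 3], [3, 4, 5]], [1, 1, 1])
```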

Polymerase reactions 

Polymerase incorporation assays were carried out using the 
Riboprobe® System (Promega) using SP6, T3 and T7 RNA 
polymerases, and using pGEM® Express Positive Control 
Template (Promega). Reactions were carried out according to the 
manufacturer's instructions. rPTP was used in place of either CTP 
or UTP, and at both 1x and 10x NTP concentrations. Products 
were electrophoretically separated on 1% agarose gels containing 
ethidium bromide and visualised under UV light. 

TAR RNA synthesis 

Wild-type TAR and the TAR mutant G26:C39 to C:G RNAs were 
transcribed from the plasmids BT0 and BT76, respectively (16). 
In these plasmids, the TAR sequence directly abuts the T3 
promoter. The plasmids were digested with EcoRI prior to the 
transcription reactions. Transcripts of 60 nucleotides were 
generated using either the Riboprobe® System (Promega) or, for 
large scale synthesis, the RiboMAX® system (Promega). Small scale 
reaction mixtures for TAR RNA synthesis (typically containing 
40 mM Tris-HCl, pH 7.9, 6 mM MgCl2, 2 mM spermidine, 
10 mM NaCl, 2 mM DTT, 40 U RNasin, 400 µM NTPs, 1 µg 
template DNA, 40 U T3 RNA polymerase, and 20 µCi 
[α-32P]GTP in 50 µl solution) were incubated for 1 h at 37°C. 
When rPTP was used instead of UTP and/or CTP, the concentration 
of rPTP in the reaction was 400 µM (1x concentration) or 4 mM. 
The products were electrophoretically separated using 12% 
polyacrylamide gels (165 x 200 x 1 mm) containing 7 M urea at 
35 W for 1 h. 
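
For the small scale reactions above, the volume of rPTP stock needed for the 1x (400 µM) and 10x (4 mM) conditions follows from C1·V1 = C2·V2. The sketch below assumes that the 25 mM aqueous rPTP solution obtained in the synthesis was used directly as the stock; the text does not state this, and the names are illustrative.

```python
def stock_volume_ul(stock_mm, final_mm, final_volume_ul):
    """Volume of stock (µl) giving `final_mm` in `final_volume_ul` (C1*V1 = C2*V2)."""
    if stock_mm <= 0 or final_mm < 0 or final_volume_ul <= 0:
        raise ValueError("concentrations and volume must be positive")
    if final_mm > stock_mm:
        raise ValueError("final concentration cannot exceed the stock concentration")
    return final_mm * final_volume_ul / stock_mm


RPTP_STOCK_MM = 25.0   # assumption: the 25 mM stock from the triphosphate synthesis
REACTION_UL = 50.0     # small scale reaction volume quoted in the text

vol_1x = stock_volume_ul(RPTP_STOCK_MM, 0.4, REACTION_UL)    # about 0.8 µl
vol_10x = stock_volume_ul(RPTP_STOCK_MM, 4.0, REACTION_UL)   # about 8 µl
```

Under that assumption the 10x condition takes up about 8 µl of the 50 µl reaction.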

Large scale reaction mixtures contained 80 mM HEPES-KOH, 
pH 7.5, 24 mM MgCl2, 2 mM spermidine, 40 mM DTT, 7.5 mM 
NTPs, 5 µg template DNA, and 10 µl of T3 enzyme mix in 100 µl 
solution. The synthesis of TAR was also carried out using 10 mM 
rPTP instead of UTP. The reaction mixtures were incubated at 
37°C for 4 h. The products were electrophoretically separated 
using 6% polyacrylamide gels (350 x 200 x 1 mm) containing 
7 M urea at 35 W for 2.5 h. The transcripts were visualised by 
autoradiography or UV shadowing for small and large scale 
syntheses, respectively. The correct band was cut out and the RNA 
was extracted with 0.5 M ammonium acetate/1 mM EDTA. The 
extracts were then desalted using NAP-10 columns (Pharmacia). 

Digestion of TAR RNAs 

Digests were carried out as described (17), and the nucleosides 
separated by HPLC on a Waters µBondapak™ RP-C18 (3.9 x 
300 mm) column. 

RNA bandshifts 

RNA bandshifts were carried out according to the methods 
described previously (16). In summary, 10 000 c.p.m. (~2 nM) of 
[α-32P]GTP labelled TAR RNA (prepared as described above but 
using 5-fold excess of rPTP, i.e. 2 mM) was incubated at room 
temperature for 5 min with various concentrations of ADP-1 
peptide (50-1000 nM) in TK buffer (50 mM Tris, pH 7.5, 20 mM 
KCl) containing 0.1% Triton X-100 and 100 mM DTT. The 
mixtures were electrophoretically separated (10 W, 1 h) on 8% 
native polyacrylamide gels (165 x 200 x 1 mm) containing 0.1% 
Triton X-100. The bands were visualised by overnight exposure 
to autoradiography film. The autoradiograph was scanned using 
a Molecular Dynamics Scanning Imager 300A-T. 

RESULTS 

The synthesis of the ribosyl analogue, rP, is shown in Figure 2. 
5-Allyluridine (12) was silylated with t-butyldimethylsilyl 
chloride to give (4) (13). This was then converted to the 
5-hydroxyethyluridine (7) by dihydroxylation of the olefinic bond 
with potassium osmate/N-methylmorpholine-N-oxide followed by 
periodate cleavage to the aldehyde (6). Curiously, the periodate 
reaction was not effective using THF/water or acetone/water but 
reaction was obtained using dioxane/water as solvent. The 
aldehyde 6 was not isolated but converted immediately to the 
alcohol, 7, by reduction with sodium borohydride. This was then 
converted to the bicyclic P analogue in a manner similar to that 
described for the deoxynucleoside (8) by a Mitsunobu reaction of 
the alcohol (7) with N-hydroxyphthalimide, triazolylation and 
then ring closure using ammonia in dioxane. Finally, the silylated 
derivative (10) was deprotected using ammonium fluoride in 
methanol to give the free nucleoside. This was then converted to 
its 5'-triphosphate derivative. 

As a screen for the incorporation of the analogue into RNA by the 
RNA polymerases of the bacteriophages SP6, T3 and T7, the 
positive control template from the pGEM Express Positive kit 
(Promega) was used. UTP or CTP were replaced by rPTP in the 
polymerase transcription reactions. The 2'-deoxynucleoside 
triphosphate, dPTP, is incorporated opposite dA or dG by Taq 
polymerase in PCR reactions. However, neither TTP nor dCTP 
could be entirely replaced by dPTP (10). Both as a substrate 
triphosphate and as a template for Taq polymerase, dP resembled 
T more than dC (18). It had also been demonstrated that, in terms 
of hybridisation, P:A base pairs are equivalent to T:A base pairs 
whilst P:G pairs are slightly destabilising when compared with 
C:G pairs in deoxyribo-oligomers (8). Using T7 polymerase, full 
length products were formed; when rPTP was used to replace CTP a 
distinct, but different, product was obtained. Product yields were 
lower than the controls using the four NTPs (Fig. 3), but when the 
rPTP concentration was increased 10-fold compared with the 
other triphosphates the yield of products was substantially 
improved. In this instance T3 RNA polymerase incorporated 
rPTP better in place of CTP rather than in place of UTP (but see 
below, P-TAR transcripts, Fig. 4 and refs 1,3). T7 and SP6 showed 
a marked preference for replacing UTP rather than CTP with rPTP. 
Regulation of HIV-1 transcription is controlled by a specialised 
RNA/protein interaction (for review see 19). The trans-activation 
responsive region (TAR) of HIV-1 is located immediately 
downstream of the HIV-1 transcription start site from positions +1 
to +59, and is therefore transcribed immediately upon initiation 
of transcription from the HIV promoter (20). TAR is known to 
form a highly stable stem-loop and it has a tripyrimidine bulge 
near the apex of the structure (Fig. 5) (21). TAR RNA is bound 
by the virally-specified trans-activator protein Tat (22). This 
leads to a dramatic increase in HIV gene expression. The Tat/TAR 
interaction is very specific and mutations in and around the 
tripyrimidine bulge abolish both Tat binding and trans-activation 
of HIV transcription (16,23). The binding of Tat to TAR is 
decreased by the mutation G26:C39 to C:G (mGC). A shorter 
Tat-derived peptide (ADP-1), which contains the core and basic 
regions of Tat (residues 37-72), has been shown to exhibit similar 
specificity to the wild-type protein (16,23). 

Figure 3. Agarose gel electrophoresis of the products of RNA transcription 
[...] polymerases T7, T3 and SP6 are shown. Above each lane, the omission of either 
[...] concentration of the other NTPs (+ or +10x P) is recorded. 

Figure 4. TAR transcripts from wild-type DNA using rPTP to replace UTP 
[...] exposure to film for 30 min and after 2 h, where it can be seen that rPTP can be 
used to entirely replace the pyrimidine triphosphates. 

Both wild-type and mGC TAR transcripts were made using the 
bacteriophage T3 polymerase to assess the efficiency of 
incorporation of rPTP in place of either UTP or CTP. When 
transcription was carried out using rPTP to replace UTP, full 
length RNA transcripts were readily obtained. When replacing 
CTP, the yield of transcript was markedly reduced (Fig. 4), but 
full length transcripts could still be visualised on gels even when 
both CTP and UTP were replaced by rPTP. Although the reason 
for this low efficiency of incorporation into a smaller transcription 
product is unclear, the fact that TAR RNA is a highly ordered 
structure may provide a possible explanation. 

In order to characterise the transcripts, large scale reactions 
were carried out. TAR RNA itself has been shown to have a high 
melting temperature, 65°C in 10 mM potassium phosphate, 
[illegible] mM sodium chloride (24), consequent on its hairpin structure, 
and this correlated with our findings (Tm in 100 mM sodium 
phosphate, 78°C). The transcripts containing rPTP in place of 
UTP also had a high melting temperature (76°C). This was 
unexpected in that P:G base-pairs in DNA duplexes are ~2°C less 
stable per modification than a C:G base-pair (8); there are also 
two U:G base-pairs in the native TAR RNA which will be 
replaced by P:G base-pairs in the analogous P-containing TAR 
(hereafter called P-TAR). However, the derived stacking enthalpy 
was lower than that of the native TAR; these results are shown in 
Figure 5. Similar findings were obtained with the mutant 
(G26:C39 to C:G) TAR transcription products. The two P-TAR 
transcripts also have the same melting temperature and very 
similar stacking enthalpies. The TAR transcripts were also 
digested with snake venom phosphodiesterase and alkaline 
phosphatase to determine their nucleoside composition. The 
P-TAR transcript was shown to contain only C, G, A and P, 
though the exact composition could not be accurately determined 
(data not shown), and therefore it was not possible to show 
unequivocally that rPTP had also been incorporated in place of 
CTP, although it might be expected that it would be, albeit at a low 
frequency and randomly. 

Figure 6. CD spectra of wild-type TAR RNA and the P-TAR RNA. 



Circular dichroism measurements were carried out on the 
transcripts. The native TAR RNA had a CD spectrum similar to 
that previously described (15,24); the P-TAR produced a rather 
similar spectrum (Fig. 6), although there was a decrease in 
intensity of the 265 nm band accompanied by a red shift in the 
crossover and a decrease in intensity of the far UV region bands 
present in the native structure. Spectra for the mGC mutant TAR 
and its P-derivative were similar (data not shown). 

Binding of ADP-1, a fragment of the Tat protein (residues 37-72), 
to the modified TAR transcript was 
studied. Wild-type TAR RNA, labelled with [α-32P]GTP, was 
synthesised using either UTP or rPTP. The mutant mGC TAR was 
prepared as a control. RNA bandshift assays were performed in 
the presence of increasing concentrations of the Tat fragment 
ADP-1 (0-1000 nM). The modified P-TAR ran with a slightly 
lower mobility due to the PMP residues; this altered mobility has 
also been observed with oligodeoxyribonucleotides containing 
dP. The mutant mGC TAR is known to have a 9-fold lower 
affinity for ADP-1 than the wild-type (16). Figure 7 shows that 
the affinity of P-TAR for ADP-1 is lower than that of the 
wild-type TAR but, remarkably, is higher than that of the mGC 
TAR mutant, and this was confirmed using densitometry 
measurements. The artefactual second shifted band, known to 
occur at high concentrations of ADP-1 peptide (16), is also seen 
with the P-TAR transcript. 

DISCUSSION 

The ribo-P-5'-triphosphate analogue was incorporated into RNA 
by all three RNA polymerases: by T3 and T7 and, to a lesser extent, 
by SP6. In the control system (pGEM Express Positive control 
template) rPTP was successfully used to replace entirely one of 
the two natural pyrimidine triphosphates to obtain full length 
transcripts (1.5 kb). In the TAR system using T3 polymerase it 
proved to be more difficult to obtain a high yield of product when 
using rPTP in place of CTP. Nevertheless, it was possible to 
obtain transcripts when both pyrimidine triphosphates were 
replaced by rPTP if the rPTP concentration was raised 10-fold. 
When rPTP was used to replace UTP there was a decrease in net 
synthesis compared with RNA derived by using the four natural 
triphosphates, but the yield was acceptable. 

The replacement of UTP in TAR RNA by rPTP affects a 
significant number of base-pairs. There are eight U:A base-pairs 
in addition to the three uridine residues involved in the bulge and 
the single uridine in the hairpin. It has been shown that, of these, 
U23, the first base in the bulge, is essential for recognition by 
either Tat protein or the peptide ADP-1 (25). C5-Substituents are 
tolerated, and therefore it might be expected that the analogue P 
could be accommodated in this position. Remarkably, binding in 
the band-shift assay was only reduced by a factor of four 
compared with the wild-type TAR. In contrast, the biologically 
inactive mutant (mGC) showed a 15-fold reduction in binding 
(Fig. 7), consistent with the earlier finding that mGC binding was 
reduced 9-fold (16). It is therefore clear that the P-TAR does bind 
to ADP-1, demonstrating that P can be tolerated not only 
throughout the TAR structure, but more importantly in replacing 
the essential uridine at position 23. The observations provide a 
prima facie case for assuming that the proposed U23:arginine 
interaction in the TAR:ADP-1 complex (26) can be mimicked by 
a P23:arginine contact. Overall the experiments indicate that 
P-TAR has a very similar structure to the wild-type TAR 
transcript and strongly suggest that it can undergo the large 
conformational change that occurs on binding (27). 

In addition to the nine U:A base-pairs there are also two U:G 
base-pairs in the native TAR RNA stem. In DNA duplexes the 
P:G base-pair is in rapid chemical exchange between Watson- 
Crick and wobble configurations with a very low free energy 
difference between them (28,29). Evidently the U:G base-pairs in 
TAR are correspondingly replaced by P:G with very little effect 
on structure. 

The similarity between the wild-type structure and P-TAR was 
further demonstrated by the fact that they have similar melting 
temperatures and CD spectra. They also showed virtual identity 
in Tm, stacking enthalpy and CD spectra. 

Earlier work on the 2'-deoxynucleotide analogue of P (dPTP) 
demonstrated that it had very similar kinetic parameters of 
incorporation to those of TTP (10). Further work is in progress to 
investigate the kinetics of incorporation of the ribo-P triphosphate, 
and the hybridisation properties of rP-containing oligomers as 
well as their templating properties. 

Work during recent years has shown that interactions in RNA 
secondary and tertiary structures are much richer than those in 
DNA and that correspondingly RNA-protein interactions are 
likewise complex and only now beginning to be understood. We 
have used the Tat (ADP-1)/TAR system purely as a model to 
examine the incorporation of rPTP into a biologically active RNA 
oligomer, without necessarily expecting to learn anything new 
about the Tat/TAR interaction. Nevertheless, our findings have 
shown that incorporation of rPTP into TAR has had very little 
effect on its physical characteristics and biological activity. We 
are therefore in the process of examining the effect of site specific 
incorporation of P by chemical synthesis into TAR to examine its 
effects more rigorously. We believe that these aspects of RNA 
chemistry can be modulated by the introduction of hydrogen bond 
degenerate base residues which, in turn, should lead to a variety 
of novel outcomes and applications. 

ABSTRACT 

2,6-Diaminopurine (DAP) is an analogue of adenine 
which can be converted to nucleotides that serve as 
substrates for incorporation into nucleic acids by 
polymerases in place of (d)AMP. It pairs with thymidine 
(or uracil), engaging in three hydrogen bonds of the 
Watson-Crick type. The result of DAP incorporation is 
to add considerable stability to the double helix and to 
impart other structural features, such as an altered 
groove width and disruption of the normal spine of 
hydration. DNA containing DAP may or may not be 
recognized by restriction endonucleases; RNA 
containing DAP may not engage in normal splicing. 
The DAP-T pair affects the local flexibility of DNA and 
impedes the interaction with helix bending proteins. 
By providing a non-canonical hydrogen bond donor in 
the minor groove and/or blocking access to the floor of 
the groove it strongly affects interactions with small 
molecules such as antibiotics and anticancer drugs. 
Examples which illustrate altered recognition of 
nucleotide sequences in DAP-containing DNA are 
presented: changed sites of cutting by bleomycin, 
photocleavage by uranyl nitrate and footprinting with 
mithramycin. Using DNA in which both A→DAP and 
G→inosine substitutions have been made it is possible 
to assess precisely the role of the purine 2-amino 
group in ligand-DNA recognition. 

INTRODUCTION 

It was exactly half a century ago when the first modified 
nucleobase, 5-methylcytosine (5-MeC), was discovered (1). 
Nowadays, it is well established that this unusual but naturally 
occurring base participates in the control of gene expression in 
higher organisms (2). Over the last 50 years many other modified 
bases have been discovered in DNA. One of the most interesting is 
2-aminoadenine (abbreviated to DAP or D for 2,6-diaminopurine), 
which is used in place of adenine by the cyanophage S-2L (3,4), 
whose very existence shows that DAP in DNA is compatible with 
normal DNA function (5,6). 

The DAP·T base pair possesses an extra hydrogen bond compared 
with A·T because of the additional -NH2 pointing toward the minor 
groove of DNA (Fig. 1). DAP is a common tool in nucleic acid 
chemistry which can be used to study molecular recognition 
between DNA or RNA and ligands, both small and large. In this 
paper, various applications of DAP are briefly presented. In 
particular, the emphasis is on how this base can be extremely useful 
in investigations on the structure of nucleic acids as well as 
sequence-specific interactions between DNA and small molecules 
or proteins. 

CHEMICAL AND ENZYMATIC SYNTHESIS OF 
DIAMINOPURINE-CONTAINING DNA 

In most cases, the DAP base is introduced chemically into DNA 
sequences using conventional phosphoramidite chemistry. Synthesis 
of the DAP nucleoside phosphoramidite has been described (7). 
Alternatively, DAP can be incorporated into DNA by enzymatic 
methods via the use of DAP triphosphate and polymerases. The 
triphosphate of 2-aminoadenosine acts as a true analogue of ATP in 
transcription (8). 2,6-Diaminopurine-2'-deoxyribonucleoside is 
commercially available and methods have been described to convert 
it into dDTP. The synthesis involves converting 2-amino-2'-deoxy- 
adenosine to its 5'-monophosphate derivative dDMP followed by 
pyrophosphorylation (9,10). Other chemical routes have been 
reported (11,12). Alternatively, the triphosphate nucleotide dDTP 
may be produced biotechnologically directly from the base 
precursor DAP, as is the case with the related nucleobase 
2-aminopurine. When added to growing bacteria, 2-aminopurine is 
metabolized to form deoxy-2-aminopurine triphosphate (dAPTP) 
(13) which, like dDTP, serves as a substrate for DNA polymerases 
(14,15). It is known that DAP, which is toxic to cultured cells, is 
normally metabolized to DAP ribonucleoside and then deaminated 
to guanosine (16). DAP and its 2'-deoxyriboside (DAPdR) exert 
their toxicity by different mechanisms. DAPdR, but not DAP, acts 
as a precursor of deoxyguanosine in mammalian cells (17). The 
polymerization of chemically synthesized DTP by deoxynucleotidyl 
transferase from calf thymus has been reported (18). 

Figure 1. Structure of a diaminopurine-thymidine base pair (Watson-Crick 
pairing). The positions of the minor and major grooves are indicated. The 
2-amino group which distinguishes a DT pair from an AT pair is printed in red. 
Broken lines represent hydrogen bonds. The molecular model was built with the 
programs HyperChem 5.1 and Alchemy 2000. The guanine 2-amino group is 
shown as a blue sphere. 

DIAMINOPURINE INCREASES THE THERMAL 
STABILITIES OF DUPLEX DNA AND RNA 

DAP is frequently introduced into nucleic acid sequences to 
increase the melting temperature of DNA (19) and/or RNA 
duplexes (20) and for related applications such as primers for 
sequencing and PCR (21,22) and fingerprinting (23). Incorporation 
of DAP into short DNA oligomers increases the thermal stability 
of the duplex by 0-2 °C per DT base pair (24). Prosnyak et al. 
(23) reported that the melting temperature (Tm) of a given oligo- 
or polynucleotide containing y% of DAP is increased by a factor 
of 0.14y (ΔTm). However, the dependence of the Tm on the DAP 
content is not linear (25,26). For a 160 bp DNA fragment having 
all A replaced with DAP residues on both strands of the fragment, 
we measured Tm values of 65.9, [illegible] and 78.6°C, whereas the Tm 
values for the corresponding DNA fragment containing natural 
bases were 62.7, 70.0 and 72.8°C (11). The Tm elevation 
resulting from introduction of a 2-amino group onto A residues 
is much smaller in the deoxy series than in the ribo series; in 
the deoxyribo series the stabilizing contribution arising from the 
formation of a third hydrogen bond in DT pairs is opposed by a 
destabilization due to disruption of the spine of hydration in the 
minor groove of B-form DNA (28). 
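
The linear relation attributed to Prosnyak et al. above (ΔTm = 0.14y for y% DAP) can be written as a one-line estimate. The sketch below simply encodes that stated relation; the function name is illustrative, and since the text itself notes that the dependence of Tm on DAP content is not linear, the result should be read as a rough first approximation only.

```python
def linear_tm_increase(dap_percent, slope=0.14):
    """Estimated Tm increase (degrees C) for a sequence containing y% DAP."""
    if not 0 <= dap_percent <= 100:
        raise ValueError("DAP content must be a percentage between 0 and 100")
    return slope * dap_percent


# Example: a sequence in which 25% of the residues are DAP.
delta_tm = linear_tm_increase(25)   # about 3.5 degrees C
```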

The potential uses of DAP are wide ranging. For example, it has 
recently been used in conjunction with 2-thiothymine to produce 
selectively binding complementary (SBC) oligonucleotides 
which can facilitate the formation of particular DNA structures 
(e.g. three-arm junctions) via strand invasion (29). DAP, like 
2,4-diaminopyrimidine, can be used for the development of 
components of an extended genetic alphabet (30). Very recently, 
Nielsen and co-workers reported that incorporation of DAP 
nucleobases into peptide nucleic acids (PNA) increased the DNA 
binding and sequence discrimination of PNA oligomers (31). 

USE OF DIAMINOPURINE BY DNA POLYMERASES 

Incorporation of nucleoside triphosphate analogues by polymerases 
is a method of choice to examine miscoding by different DNA 
polymerases. For example, 8-oxo-dGTP and 8-amino-dGTP 
were employed in a recent study comparing the efficiency of 
utilization of these modified bases by HIV type-1 and murine 
leukemia virus reverse transcriptases with that of mammalian 
DNA polymerases (32). The HIV-1 reverse transcriptase readily 
accepts non-canonical bases, as well as nucleoside triphosphates 
modified on the sugar (33). Lutz et al. (30) have shown that the 
AIDS virus enzyme successfully incorporates the triphosphate form 
of 2,4-diaminopyrimidine opposite 2'-deoxy-7-deazaxanthosine, 
whereas the same reaction failed completely with calf thymus 
DNA polymerases or with the Klenow fragment of Escherichia 
coli DNA polymerase I. As regards DAP, it is a good substrate for 
a number of polymerases. Substantial changes in the minor 
groove of DNA do not disrupt recognition contacts with T3 or T7 
RNA polymerases. Apparently, there is no interaction between 
RNA polymerase and the guanine 2-amino group (34,35). Heat 
stable polymerases can also function with modified base-containing 
nucleoside triphosphates. dDTP is readily accepted by Taq 
polymerase and related enzymes. We have shown that incorporation 
of the diaminopurine nucleobase into both strands of a 160mer 
fragment presents no particular difficulties (36). 

STRUCTURAL STUDIES OF DIAMINOPURINE- 
CONTAINING DNA 

Structural studies of DNA have also profited from the use of DAP. 
Crystal structures of the hexanucleotides d(CGUDCG)2, 
d(CGTDCG)2 and d(CDCGTG)2 revealed that substitution with a 
central DU or DT pair is consistent with presumed Z-DNA 
formation (37,38). The net effect of adding an N2 amino group to 
the C2 carbon of the adenine base in a T-A base pair is to render the 
minor groove of both B- and Z-DNA more hydrophilic. Although 
the effect of the added exocyclic amino group on the stability of a 
Z conformer is greater than for a B conformer, both the Z and B 
structures have a C2 carbon that becomes almost entirely inaccess- 
ible to solvent upon addition of the N2-amino group (39). Structural 
studies with DAP-containing oligonucleotides (and other modified 
bases, in particular inosine) have shown unambiguously that the 
N2-amino group is critically important in defining the stability of 
Z- versus B-DNA. It is interesting to note that with the Z-form 
oligonucleotide d(CDCGTG), just as with the hexanucleotide 
d(CGCGCG), the continuous spine of water molecules in the minor 
groove crevice is not disrupted, suggesting that these sets of water 
molecules help to stabilize Z-DNA conformation (37,39). 



The effect of the purine 2-amino group on the preferred 
[...] formation occurs under [...]. The putative A-form of 
poly(dD-dT) is stabilized by the methyl group at position 5 of the 
pyrimidine base [...] duplex d(OCATTATTGC) and its analogue 
having all A replaced by D indicated that the A→DAP substitution 
does not disturb the [...] of the DNA duplex [...] 

[...] nucleic acids. The extent of photocleavage of double-stranded 
[...] at acidic pH [...] exhibits a very strong modulation which is 
correlated with minor groove width [...] (46). [...] of the minor 
groove of the DNA helix is to a first approximation correlated with 
its AT/GC [...] groove, whereas GC-tracts have a widened minor 
groove. Studies of uranyl-mediated cleavage of DNA containing DAP 
residues have revealed that the variation of the minor groove width 
[...] to nucleotide composition can be attributed to a large extent 
to the presence of a purine 2-amino group on GC base pairs. 

As shown in Figure 2, the reactivity towards uranyl nitrate at 
acid pH is modulated in the DAP-containing DNA quite differently 
from natural DNA, consistent with a marked widening [...] as a 
result of [...] replacement. The interpretation is clear, that A→DAP 
substitutions markedly affect the minor groove width of DNA. [...] 
groove width [...] evident [...] patterns of susceptibility of these 
two DNA species to DNase I cleavage (47). 

The exocyclic amino group of G or D plays a significant role 
in the intrinsic curvature of DNA. Gel electrophoresis studies 
employing a series of oligonucleotides 5'-AAAAAGCCGC [...] 
residues were systematically substituted with D or I residues at 
different positions indicated that the curvature induced by an 
A-tract in DNA molecules is primarily [...] the junction with the 
3'-end of the A-tract (48,49). 

Figure 2. [...] corresponds to that of the normal DNA. In the DAP 
DNA, all adenine residues are replaced by diaminopurine residues (47). 



DIAMINOPURINE AND RNA STRUCTURE AND FUNCTION

The nucleotide analogue interference mapping assay is a sensitive
method by which to identify and determine the exact [...]
[...] groups within an RNA which are essential for
[...]. The technique has recently been used to investigate
the role of every N2-exocyclic amine of G within a large RNA,
the Tetrahymena group I intron, using diaminopurine riboside
[...] studying G·U wobble pairs in a variety of RNA mo[...]

and for investigating spliceosome assembly. Antisense probes
incorporating DAP are efficiently able to select RNP [...]
which would otherwise be inaccessible [...].
[...] DAP may improve antisense activity.

DIAMINOPURINE AND PROTEIN-DNA RECOGNITION 

The 2-amino group of guanine residues is the only hydrogen bond
donor group exposed in the minor groove of DNA. [...]
impedes access to the floor of the groove [...]
the spine of hydration. As such, it could be expected to play a
determinant role in the recognition of DNA sequences by
proteins, peptides and small molecules.

[...] structure of the restriction endonuclease EcoRI
complexed to the tridecamer d(TCGCGAATTCGCG) containing
the recognition sequence (GAATTC) reveals that the enzyme
[...] the double helix mainly in the major groove
[...]. Specific hydrogen bonding interactions can be detected
between the enzyme and the O6/N7 of guanine and N6/N7 of
[...] group of guanine in the minor groove of
[...] recognition sequence. As a result, one would anticipate
little, if any, perturbation of EcoRI cleavage of modified DNA
duplexes containing bases with altered 2-amino groups. Nevertheless,
various experimental studies have shown that replacement of
deoxyguanosines or deoxyadenosines in 5'-GAATTC with
deoxyinosines or deoxydiaminopurines, respectively, can significantly
change the enzymatic activity of EcoRI (54,55). For example,
substitution with DAP at position 3, d(GADTTC), resulted in a
9-fold decrease in the specificity constant (54). The use of
DAP-containing substrates has provided valuable clues to the
mechanism by which EcoRI and related enzymes (e.g. RsrI; 56)
recognize the duplex sequence GAATTC. Similar results have
been reported with other restriction endonucleases that mainly
bind and cut via the major groove of DNA (10,57). DAP has also
been used to study the kinetics of DNA methylation by the EcoRI
modification methylase and Dam methyltransferase (58,59).

Thus although EcoRI can be considered essentially as a major
groove-binding protein, the nuclease is evidently also sensitive to
the functional groups exposed in the opposite minor groove. The
same conclusion was reached when we studied the interaction
between the factor for inversion stimulation (FIS) and DNA
containing inosine and/or DAP residues. FIS is a major groove-binding
protein from E.coli required for several processes,
including site-specific recombination, transcriptional activation
and DNA replication (60). We have shown that base substitutions
which alter the placement and presence of the purine 2-amino
group in the minor groove can affect both the intrinsic curvature
and the bendability of DNA and thereby modulate the interaction
with proteins like FIS, whose binding sites lie purely within the
major groove of the double helix (61). Two other proteins whose
interaction with DNA is strongly affected by A→D and G→I
replacements are HMG-D, a member of the HMG-1 family of
chromosomal proteins (62), and the integration host factor (IHF)
from E.coli, a small protein which binds preferentially to AT-rich
sequences. Binding studies with a series of 41 bp oligonucleotides
containing adenine analogues revealed that the interaction of IHF
with the double helix involves contacts within both the minor and
major groove. A→D substitution within the binding region
considerably reduces the protein binding affinity (up to 50-fold)
(63). For both HMG-D and IHF, the effect is attributable to an
indirect influence mediated via helix deformability.



DIAMINOPURINE AND DNA REPAIR 



Owing to the structural similarities between D·T and G·T pairs,
DAP has been used to study the repair of G·T mismatches.
Experiments using cell-free extracts and 45 bp oligonucleotides
containing D·T pairs at defined positions suggest that the repair
mechanism operating on D·T pairs may be the same as the human
G·T repair pathway (64). However, more recent in vitro studies
have revealed marked differences as regards the extent of incision
by human thymine glycosylase of 45 bp heteroduplexes bearing
G·T mispairs or D·T pairs (65).

DRUG-DNA RECOGNITION STUDIES USING 
DIAMINOPURINE 

About 30 years ago the idea was first proposed that the 2-amino
group of guanine is a significant determinant for the binding of small
molecules within the minor groove of the double helix. Cerami et al.
(66) studied binding of the antitumor drug actinomycin (an
antibiotic that remains extensively used in cancer chemotherapy) to
poly(dI)·(dC) and to a synthetic analogue of poly(dA-dT)·(dA-dT)
containing DAP bases partly or wholly replacing the adenine bases.
This was the first experiment using polydeoxynucleotides in which
the purine 2-amino group was deleted (G→I substitution) or added
to adenine residues (A→D substitution). Later, short oligonucleotides
containing DAP residues were used as substrates for DNA
binding/cleavage studies. Using hexanucleotides possessing A·T,
G·C, I·C or D·T pairs, Sugiyama et al. (67) investigated the
mechanism of DNA cleavage by the antibiotic neocarzinostatin.
Gao et al. (68) also used a hexanucleotide, d(CGTDCG)2, to refine
the X-ray structure of the covalent formaldehyde-mediated complex
[...] duplex [...].

Since the pioneering work of Cerami et al. (66), the influence of
the 2-amino group of guanine on drug-DNA recognition has
[...] critical, but its exact role has remained
uncertain. [...] recently that its precise function has been
[...] through the combined use of PCR technology and
[...], which enabled us to
[...] to what extent the 2-amino group of guanine



Sequence-specific cleavage of DNA by the antitumour antibiotics
bleomycin and calicheamicin γ1I is strongly dependent on
the position of the purine 2-amino group. For bleomycin,
relocating the 2-amino group from guanine to adenine nucleotides
creates new cleavage sites at pyrimidine residues lying 3' of DAP
residues. For calicheamicin, the presence of a purine 2-amino
group adjacent to the cutting site potentiates the cleavage reaction
([...]). The 2-amino group also constitutes a key structural element for
sequence-specific recognition of DNA by non-covalent binders,
irrespective of their mode of interaction with the double helix. Such
GC-specific antibiotics as mithramycin and chromomycin find new
binding sites associated with DAP-containing sequences in
(I+DAP)-substituted DNA and are excluded from former canonical
sites containing I·C base pairs. The converse was found to be the
case for a group of normally AT-selective ligands which bind in
the minor groove of the helix, such as netropsin, berenil and DAPI
(36,69,70). The binding sites of almost all DNA-binding drugs
and antibiotics strictly follow the placement of the purine 2-amino
group, which serves as both a positive and negative effector (69).
To illustrate the effect of DAP residues on drug-DNA
recognition, a footprinting gel obtained with the antitumour
antibiotic mithramycin is presented in Figure 3. The effect of
shifting the purine 2-amino group from guanines to adenines (by
virtue of combined A→D and G→I substitutions) is to provoke
a significant redistribution of binding sites for mithramycin such
that the newly created sites containing D·T pairs are substantially
preferred over GC-containing sites (36). For example, the strong
footprint around position 100 with normal DNA in the presence
of mithramycin is totally absent with the modified DNA species
and, conversely, the footprint around position 87 with I+DAP
DNA corresponds to a region of enhanced DNase I cleavage with
normal DNA. Also, the strong footprint observed around position
73 with normal DNA is missing in DNA containing both I and
DAP residues. There is no doubt that the pattern of binding sites
for mithramycin is radically changed. The drug is displaced from
its GC sites in natural DNA to pick up new sites in the
DAP-rich sequences created in the modified nucleic acid.

The most pronounced effects attributable to DAP were observed
with the quinoxaline antibiotics such as echinomycin and triostin A
(71,72). The A→D substitution potentiates enormously the interaction
of these two drugs with DNA, up to 1000-fold. With normal
DNA a concentration in the 10-20 µM range is needed to evidence
strong binding to CpG sites. With the DAP-containing DNA,
footprints are already very pronounced at only 0.5 µM drug and
binding to newly created TpD sites can be unambiguously detected
at a concentration as low as 10-20 nM (72). This enhancement of
binding is only seen with the naturally occurring quinoxaline
antibiotics and does not occur with the synthetic analogue
TANDEM, which recognizes AT-containing sites and is totally
insensitive to the relocation of the exocyclic amino group (72). In this
case, the binding specificity may arise primarily from stacking and
hydrophobic interactions rather than from direct contact with the
exocyclic guanine 2-amino group (72). With other drugs, such as
actinomycin, the extent of binding to DAP·T sites is essentially
unchanged compared with normal DNA (71). There is something
special about regions of alternating T·D base pairs which generates
unusually good binding sites for echinomycin and triostin A. The
local structure and/or the rigidity of the TpD sites could be exploited
by the drugs to fit particularly neatly within the minor groove.
Alternatively, the stacking of their quinoxaline rings upon DAP·T
base pairs could be especially propitious (73).

Figure 3. DNase I footprinting of mithramycin on the Crick (sense) strand
of normal and inosine + DAP-containing tyrT(A93) DNA. The products of
DNase I digestion were identified by reference to the Maxam-Gilbert
markers (lanes T+C and G+A). Control lanes (Ct) show the products
resulting from limited DNase I digestion in the absence of drug. The
remaining lanes show the products of digestion in the presence of the
indicated antibiotic concentrations (expressed as micromolar). The scale on
the left corresponds to the standard numbering of the tyrT DNA as
represented in Figure 1.
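
The "up to 1000-fold" potentiation quoted for the quinoxaline antibiotics can be checked against the concentrations given in the text. The sketch below is only illustrative arithmetic: it treats the lowest drug concentration giving a detectable footprint as a rough proxy for binding strength, which is an assumption made here and not a binding-constant measurement from the source. The function name `fold_enhancement` is hypothetical.

```python
UM_TO_NM = 1000  # 1 micromolar = 1000 nanomolar


def fold_enhancement(conc_normal_nM, conc_modified_nM):
    """Ratio of the drug concentration needed with normal DNA to that
    needed with modified DNA (both in nM). Values above 1 mean the
    modified DNA is bound at lower drug concentration."""
    if conc_normal_nM <= 0 or conc_modified_nM <= 0:
        raise ValueError("concentrations must be positive")
    return conc_normal_nM / conc_modified_nM


# Figures taken from the text: 10-20 uM with normal DNA (CpG sites),
# 10-20 nM for detectable binding at TpD sites in DAP-containing DNA,
# and 0.5 uM for "very pronounced" footprints on DAP-containing DNA.
low_end = fold_enhancement(10 * UM_TO_NM, 10)        # 1000.0
high_end = fold_enhancement(20 * UM_TO_NM, 20)       # 1000.0
pronounced = fold_enhancement(10 * UM_TO_NM, 0.5 * UM_TO_NM)  # 20.0

print(low_end, high_end, pronounced)
```

Comparing the two ends of each quoted range gives the same ratio, which is consistent with the three-orders-of-magnitude figure stated for echinomycin and triostin A.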



There are very few cases where the replacement of adenine with
DAP has little or no effect on the sequence selectivity associated
with drug binding. The footprinting profile of the 'threading'
intercalator nogalamycin is potentiated in DAP+I-substituted DNA
but otherwise remains much the same as with natural DNA (36).
Studies with a series of antitumour bis-naphthalimide derivatives
which bis-intercalate into DNA sequences, particularly those
containing GpT (ApC) and TpG (CpA) steps, showed that
repositioning the 2-amino group of G·C base pairs by substitution
with inosine and/or DAP had little effect on the distribution of
drug molecules between binding sites. In contrast to nearly all
common (bis)intercalating drugs, the bis-naphthalimides appear
to engage in contacts with the edges of the base pairs via the major
groove of the double helix (74).

CONCLUSION 

After incorporation of DAP into DNA, the significant difference
from adenine is that a 2-amino group is present in the minor
groove (Fig. 1). The consequent modification of the surface of the
minor groove results in altered conformational properties of the
double helix, leading to altered recognition by proteins and small
molecules. The D·T base pair is more stable than an A·T pair. The
reinforcement of base pairing reduces the flexibility of the DNA
and thus generally reduces the extent of protein binding, at least
for small proteins such as FIS, HMG-D and IHF. A→D substitutions
also decrease the capacity for binding within the minor groove of
antibiotics such as netropsin and distamycin, whereas for other
drugs, like the quinoxaline antibiotics echinomycin and triostin, the
DAP substitution promotes the recognition process considerably.
It is likely that the larger D·T pair (having a dipole moment of
~2.3) gives rise to better stacking interactions with an intercalating
chromophore than the standard A·T pair (dipole moment ~1.7)
(75-77). The changes in surface area and electrostatic properties of
the base pair may favour interaction with a planar chromophore but
the strength of intercalation must also depend on the interaction of
the attached groups (e.g. the peptide moiety of echinomycin or the
sugar moiety of calicheamicin) with the sequences flanking the
intercalating site. The approach described here, which uses modified
bases to study ligand-nucleic acid interactions, provides useful
information, but one should bear in mind that the observed effects
can arise from direct interaction between the ligand and the newly
introduced group on the base, as well as from indirect interactions.
Analogue substitution, be it with DAP or another base, can always
affect the position or properties of neighbouring bases that might be
involved in direct interactions with the ligand.

In addition to the varied applications presented above, DAP has
been used for the recognition of abasic sites in DNA (78), to
investigate mechanisms of mutagenesis (79,80) and to select
purine-resistant variants from mutagenized cultures of Drosophila
(81). Thus, the potential applications of DAP as a nucleobase
cover many modern aspects of nucleic acid chemistry from
structure to biology. DAP represents a useful chemical product
particularly well suited to increase the stability of double-stranded
nucleic acids. At the same time it is a natural product and
a key element of the genetic machinery of the S-2L cyanophage.
The ability of polymerases to accept non-standard base pairs,
such as D·T pairs, is remarkable in the light of the physiological
role that the polymerases play. Studies with DAP and other
modified bases reinforce the idea that the Watson-Crick formalism
can be extended while maintaining a high fidelity of DNA
replication which is essential for preserving the integrity of living
organisms. DAP-containing nucleic acids can be prepared in
many synthetically convenient ways so as to produce different
repertoires of molecules having a range of functionalities. There
is good reason to believe that the successful use of DAP will
continue to inspire the development of novel nucleic acid
products and the emergence of new techniques.

Abstract: The influence of the 2-amino group of guanine on antibiotic-mediated cleavage has been studied using
DNA in which that group has been removed from guanine, added to adenine, or both. A homologous series of 160
base pair fragments of DNA containing inosine and/or 2,6-diaminopurine residues in place of guanosine and/or
adenine residues respectively were synthesized by the polymerase chain reaction and subjected to sequence-specific
cleavage by the iron-bleomycin complex or calicheamicin γ1I. Although the 2-amino group is not absolutely required
it constitutes a key structural element which directs sequence-specific cleavage of DNA. For bleomycin, relocating
it created new cleavage sites at pyrimidine residues lying 3' to 2,6-diaminopurine residues. For calicheamicin, the
presence of a purine 2-amino group adjacent to the cutting site potentiated the cleavage reaction. Sequence recognition
by bleomycin seems to rely on direct interaction with guanine whereas DNA conformation/flexibility appears more
important for calicheamicin.



A large number of potent anticancer drugs owe their efficacy
to their ability to promote DNA degradation. Such is the case
for the bleomycins and the calicheamicins (Figure 1), two
families of glycoconjugate antibiotics capable of inducing DNA
strand breakage via free radical mediated mechanisms.1-3
Cleavage of DNA by bleomycin is dependent on the participation
of a redox-active metal ion and a source of oxygen4-6
whereas calicheamicin and other enediyne compounds require
a thiol cofactor.7-9 In both cases the cleavage reaction proceeds
via an attack on deoxyribose by highly reactive species produced
upon chemical activation of the drug, but the molecular
mechanisms of free radical generation and of DNA cleavage
are different. The bleomycin-Fe(II) complex combines with O2
to produce a reactive oxygenated metallobleomycin species
which is capable of abstracting a hydrogen atom from the
deoxyribose ring.2 Bleomycin generates mainly single strand
breaks whereas calicheamicin produces almost exclusively
double strand breaks.10 The chemistry of calicheamicin and
other enediyne compounds has been examined in detail in recent
years: a reducing agent (e.g. glutathione, dithiothreitol) acts as

* Present address: Institut de Recherches sur le Cancer, INSERM Unite

Figure 1. Structures of bleomycin A2 and calicheamicin γ1I.

a nucleophile to initiate the reaction of the trisulfide moiety
with the enediyne system which then undergoes a Bergman
cyclization reaction leading to a DNA-damaging benzenoid
diradical.3 A feature common to both agents is the high
sequence-selectivity of the DNA cleaving reaction: pyrimidine
residues 3' to a guanine residue (i.e. GpC and GpT sequences)
are preferentially cleaved by bleomycin11-15 whereas calicheamicin
selectively attacks a pyrimidine residue embedded
in a short homopyrimidine·homopurine tract. Double-stranded
lesions produced by calicheamicin (the leading compound
in the series) occur with a tetranucleotide specificity mainly at
TCCT·AGGA, with other minor sites at short runs of pyrimidines,
e.g. TCCG, TCCC, TCTC.16-18

The pattern of cleavage is believed to result from prior
sequence-selective binding of the antibiotics to DNA.19-27,28-36
Although the exact modes of binding remain controversial,
molecular modeling and experimental studies have suggested
that both antibiotics bind within the minor groove of the DNA
helix where the 2-amino group of guanine may constitute a
critical sequence recognition element. Models have been
proposed which involve, for bleomycin, hydrogen bonding
between one of its bithiazole nitrogens and the guanosine
2-amino group37,38 and, for calicheamicin, interaction between
the polarizable iodine atom on its benzoate core and the
exocyclic amino group of the 5' guanine in the sequence
AGGA.39,40 The bleomycin model appears consistent with
footprinting studies on bithiazole-netropsin hybrid ligands,41
prompting the inference that interaction of the antibiotic with

Figure 2. Structures of hydrogen-bonded purine-pyrimidine base
pairs. Broken lines represent hydrogen bonds. I represents inosine; DAP
represents 2,6-diaminopurine (2-aminoadenine).

the 2-amino group of guanine must somehow be responsible
for the sequence specificity of DNA cleavage by the iron-bleomycin
complex. By contrast, footprinting studies on
calicheamicin γ1I suggest that interaction with guanine is likely
to be a minor factor in determining the sites of cleavage.42

To examine directly the influence of this substituent on DNA
cleavage by the two antibiotics we have observed the reaction
using DNA in which the purine 2-amino group has been
removed, added, or shifted. Removal amounts to converting
guanosine nucleotides to inosines (Figure 2). Addition can be
accomplished by a modification such that the 2-amino group is
present on all purines: this involves replacing the adenines with
2,6-diaminopurines (DAP, Figure 2) while retaining it on
guanines. Shifting the 2-amino group can be accomplished by
relocating it on to the A·T base pairs (A → DAP substitution)
while at the same time removing it from the G·C pairs (G →
I substitution). Therefore, by comparing the patterns of cleavage
with each of these DNA fragments we can observe the precise
effect of the purine 2-amino group on the sequence-specific
cleavage of DNA by bleomycin-Fe(II) complex and calicheamicin.
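
The removal, addition and shifting scheme described above can be sketched as a simple base-substitution table. This is a minimal illustration, not the authors' procedure (the DNA itself was made by PCR with modified nucleotides): the function names, the dictionary layout and the one-letter code `D` for 2,6-diaminopurine are assumptions made here, while `I` for inosine and the four DNA types follow the text.

```python
# Each DNA type maps natural purines to the analogue that replaces them.
# "I" = inosine (guanine without the 2-amino group),
# "D" = 2,6-diaminopurine (adenine with an added 2-amino group).
SUBSTITUTIONS = {
    "normal": {},                      # 2-amino group on G only
    "inosine": {"G": "I"},             # 2-amino group removed
    "DAP": {"A": "D"},                 # 2-amino group on every purine
    "I+DAP": {"G": "I", "A": "D"},     # 2-amino group shifted from G.C to A.T
}


def substitute(seq, dna_type):
    """Return the sequence of one strand as it would read in the given
    DNA type; bases without a substitution are left unchanged."""
    table = SUBSTITUTIONS[dna_type]
    return "".join(table.get(base, base) for base in seq.upper())


def carries_2_amino(base):
    """True for purines bearing an exocyclic 2-amino group."""
    return base in ("G", "D")


if __name__ == "__main__":
    for dna_type in SUBSTITUTIONS:
        modified = substitute("GAATTC", dna_type)
        flags = [carries_2_amino(b) for b in modified]
        print(dna_type, modified, flags)
```

Because the modified nucleotides are incorporated into both strands, the same mapping would be applied to each strand separately when comparing cleavage patterns.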

Results 

A series of 160 base pair tyrT(A93) DNA fragments43
containing normal or modified nucleotides were synthesized
using the polymerase chain reaction (PCR), labeled on one or
another of the complementary strands, and then subjected to a
standard cleavage reaction in the presence of each antibiotic.

Cleavage by Bleomycin. The cleavage pattern generated
with the bleomycin-Fe(II) complex varies markedly depending
on whether the 2-amino group is present on guanine or adenine

(42) Mah, S. C.; Townsend, C. A.; Tullius, T. D. Biochemistry 1994,
33, 614-621.

(43) Drew, H. R.; Weeks, J. R.; Travers, A. A. EMBO J. 1985, 4,
1025-1032.






Figure 3. Autoradiographs showing cleavage of normal and modified
DNA by the bleomycin-Fe(II) complex (5 µM). The left and right panels
refer respectively to the labeled Watson and Crick strands of tyrT(A93)
DNA containing the four natural nucleotides (normal DNA),
inosine residues in place of guanosine (Inosine DNA), diaminopurine
residues in place of adenine (DAP DNA), or inosine plus DAP residues
in place of guanosine and adenine respectively (I+DAP DNA).
Chemical identities of the cleavage products were assigned by reference
to formic acid-piperidine markers specific for purine residues (lanes
G+A), corrected for the expected 1-1.5 band shift due to 5'
end-labeling.51,52 The G+A track shown in the right pair of panels was
determined with the I+DAP-substituted DNA so it is strictly an I+DAP
track; it is characterized by much stronger bands at the DAP residues
than the inosines, which helps to confirm the correctness of the sequence
and the faithful incorporation of I in place of G and DAP in place of
A. The scales on the sides of the autoradiographs correspond to the
standard numbering of the tyrT(A93) sequence as represented in Figure
4.

residues or both (Figure 3). The differences between normal
and modified DNA are summarized in the histograms shown
in Figure 4. With normal DNA, the iron-bleomycin complex
cuts most strongly at GpC and GpT sequences (the underlined
nucleotide indicates the reactive site). Some ApC and GpA
sites, plus occasional ApT steps, are cleaved as well. When
every purine residue bears a 2-amino group (DAP DNA)
cleavage can be observed at almost all GpC, GpT, DAPpC, and
DAPpT sites, i.e. at nearly all 5'-purine-pyrimidine (RpY)
dinucleotide steps. Yet DNA fragments completely lacking the
2-amino group remain susceptible to cleavage by bleomycin.
Inosine DNA is cut best at 5'-ApC sites, with weaker cleavage
at certain IpC and ApT sequences. With Inosine DNA the
majority of the strong cutting sites have a T residue at position
2 so it seems that 5'-TRYR sequences, particularly TACI,
provide the preferred sites for cleavage by the antibiotic.44 With
the doubly substituted DNA containing I·C and DAP·T base
pairs, the cleavage occurs principally at DAPpC and DAPpT
sites. Occasional IpC sites are weakly cut by the iron-bleomycin
complex.
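
The 5'-purine-pyrimidine (RpY) steps discussed above can be enumerated for any of the four DNA types with a short scan. The sketch below is a hedged illustration only: the function names and the one-letter code `D` for diaminopurine are assumptions introduced here, and the code lists candidate steps by sequence alone without predicting cleavage intensity.

```python
PURINES = set("AGID")      # adenine, guanine, inosine, diaminopurine (D)
PYRIMIDINES = set("CT")
TWO_AMINO_PURINES = set("GD")  # purines bearing the exocyclic 2-amino group


def rpy_steps(seq):
    """Return (index, dinucleotide) for every 5'-purine-pyrimidine step,
    where index is the 0-based position of the purine."""
    seq = seq.upper()
    return [
        (i, seq[i:i + 2])
        for i in range(len(seq) - 1)
        if seq[i] in PURINES and seq[i + 1] in PYRIMIDINES
    ]


def rpy_steps_with_2_amino(seq):
    """Subset of RpY steps whose purine carries a 2-amino group
    (GpC, GpT, DpC, DpT), the steps reported as most strongly cut."""
    return [(i, step) for i, step in rpy_steps(seq)
            if step[0] in TWO_AMINO_PURINES]


if __name__ == "__main__":
    for strand in ("AGCTDT", "TACI"):
        print(strand, rpy_steps(strand), rpy_steps_with_2_amino(strand))
```

In this scheme an inosine-containing strand such as TACI still yields an ApC step, in line with the observation that DNA lacking the 2-amino group remains a substrate.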

Cleavage by Calicheamicin. Calicheamicin cuts normal
DNA at a restricted number of sites such as 5'-TTCA, TTCT,
TTAC, TCCC, and TTTT (Figure 5). As expected, the cuts
appear displaced asymmetrically toward the 3' ends of the
complementary strands (Figure 6) due to binding of the antibiotic
in the minor groove. Calicheamicin γ1I is known to cleave
normal DNA via a pair of cuts separated by 2-3 nucleotides
in a 3' direction such that at least one of those cuts occurs
preferentially at certain sequences.28-36 Normal DNA is generally
less well cleaved than modified DNAs under identical
conditions. The difference is emphasized in Figure 7 which
shows the results of a comparative concentration-dependence
study using normal and doubly-substituted DNAs. Several sites
of reaction with the two DNAs occur in much the same places
but the intensity at a given antibiotic concentration is much
stronger with the I+DAP DNA. This may in part reflect the
fact that the readable portion of the tyrT(A93) DNA autoradiogram
lacks certain calicheamicin-favored sequences such as
5'-TCCT, GCCT, TCCG, or TCTC. Figure 5 reveals additionally
that cleavage of inosine DNA appears much more uniform:
as well as strong reaction at a few canonical sites such
as TTTT or TTCA there is significant non-specific cleavage
indicating that removal of the 2-amino group is directly
responsible for diminished sequence recognition by the enediyne
compound. The results obtained with DAP and I+DAP DNA
provide clues as regards the need for a calicheamicin-guanine
interaction to promote a high level of DNA breakage. Cleavage
at the underlined pyrimidine residue in the sequences
TTCA·TGAA and TTCT·AGAA (positions 45 and 52,
respectively) is abolished when the adjacent G·C pair is
replaced by an I·C pair. Conversely, there are numerous
sequences cleaved weakly or not at all in normal DNA which
become exquisitely susceptible to calicheamicin attack when a
2-amino group is introduced on to the adjacent base pair. Such
is the case at positions 33, 113, 44, and 60 (TTAC·GTAA,
CCTT·AAGG, GTTC·GAAC, and TAAC·GTTA, respectively).
At the latter three sites the I+DAP DNA is strongly
cleaved whereas the DAP DNA remains insensitive, as observed
with normal DNA. It is also noteworthy that although calicheamicin
γ1I can cut homopyrimidine·homopurine sequences
lacking a G·C base pair (such as those at positions 49 and 67,
TTTT·AAAA and TTTA·TAAA, respectively) the extent of
cleavage at such sites is considerably enhanced when a 2-amino
group is introduced on to the purine residues of the complementary
strand (A → DAP substitution).
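
Since calicheamicin sites are written as a pyrimidine tract paired with its purine complement (for example TCCT·AGGA), a search for candidate sites has to consider both strands. The following sketch is one hedged way to do that for a natural (A, C, G, T) sequence; the function names are hypothetical, the default site list is simply the tetranucleotides named earlier in the text (TCCT, TCCG, TCCC, TCTC), and reported positions are 0-based offsets on the strand being read 5' to 3', not the tyrT(A93) numbering.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Tetranucleotides named in the text as calicheamicin-favored sites.
DEFAULT_SITES = ("TCCT", "TCCG", "TCCC", "TCTC")


def reverse_complement(seq):
    """Return the complementary strand, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_sites(seq, sites=DEFAULT_SITES):
    """Return sorted (strand, start, site) tuples for every occurrence of
    a site on the given strand ("+") or its reverse complement ("-")."""
    seq = seq.upper()
    hits = []
    for strand, text in (("+", seq), ("-", reverse_complement(seq))):
        for site in sites:
            start = text.find(site)
            while start != -1:
                hits.append((strand, start, site))
                start = text.find(site, start + 1)  # allow overlapping hits
    return sorted(hits)


if __name__ == "__main__":
    print(reverse_complement("TCCT"))
    print(find_sites("AAGGATT"))
    print(find_sites("TCCTCCC"))
```

A purine-rich top strand such as AAGGATT therefore registers a hit only through its complement, which mirrors the paired notation used for the cleavage sites.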

Discussion 

These results lead to the following conclusions: (i) Adding
a 2-amino group on to adenine residues (A → DAP substitution)
is sufficient to create new cleavage sites for both antibiotics.
(ii) Removing the 2-amino group from guanine residues (G →

(44) The base adjacent to the cleaved dinucleotide can markedly affect
the extent of damage. A pyrimidine on the 5' side increases cleavage
efficiency while a purine decreases cleavage. The base on the 3' side of
the dinucleotide can also modify the cleavage intensity to a lesser extent.45

(45) Murray, V.; Tan, L.; Matthews, J.; Martin, R. F. J. Biol. Chem.
1988, 263, 12854-12859.

Figure 4. Susceptibility of normal and modified DNA to cleavage by bleomycin. In the modified nucleic acids adenine and/or guanosine residues
are replaced by diaminopurine and/or inosine residues, respectively. The relative cleavage intensity (in arbitrary units) at a given bond is expressed
as a fraction of the total cleavage of all the phosphodiester bonds within the sequence. Quantitative analysis was limited to regions where peaks
were sufficiently well resolved to permit unambiguous analysis. Data are compiled from quantitative analysis of three sequencing gels (including
the gel shown in Figure 3) and must be considered as a set of averaged values.



I substitution) does not abolish but reduces the DNA cleavage
by bleomycin and greatly diminishes the sequence-specificity
of calicheamicin γ1I. Therefore the 2-amino group is a powerful
determinant but not absolutely necessary for DNA breakage by
both bleomycin and calicheamicin. (iii) Shifting the 2-amino
group from guanine to adenine residues (A → DAP plus G →
I substitutions) produces a complete redistribution of the drug-mediated
cleavage sites, showing that both antibiotics are
extremely sensitive to the relocation of the purine 2-amino group
in DNA.

Thus the 2-amino group of guanine constitutes a key structural
element in the mechanism of DNA cleavage by the iron-bleomycin
complex as well as by calicheamicin γ1I. The results
strongly suggest that upon binding to specific sequences the
antibiotic molecule engages in contact with the 2-amino group
of guanine exposed in the minor groove, consistent with
proposed models.37-41 In particular, the results reported here
agree fully with recent NMR studies on the interaction of
bleomycin-Zn and bleomycin-Co complexes with the oligonucleotides
d(CGCTAGCG)2 and d(CCAGGCCTGG)2, respectively.38,46
In both cases, the bleomycin molecule is engaged
in direct contact with the 2-amino group of guanine, either via
one of its bithiazole ring nitrogens (Zn-bleomycin) or via the
methyl group of the pyrimidinyl moiety (Co-bleomycin). For
calicheamicin, our observations are entirely consistent with the
recent NMR experiments on the interaction between
d(GGAGCGC)·d(GCGCTCC) and the neocarzinostatin chromophore,
another enediyne system, showing that the guanine exocyclic

(46) Wu, W.; Vanderwall, D. E.; Stubbe, J.; Kozarich, J. W.; Turner, C.
J. J. Am. Chem. Soc. 1994, 116, 10843-10844.



amino group plays a critical role in stabilizing the binding of
this drug to double-stranded DNA.47,48 However, it is clear that
other aspects of DNA structure contribute to sequence recognition,
especially by calicheamicin. Base substitutions remote
from the cutting site can significantly affect the extent of
cleavage by bleomycin,45 and it has been suggested that the
flexibility/deformability of the pyrimidine strand is exploited
by calicheamicin γ1I to distort the DNA structure so that the
drug can fit within the minor groove.49-52 The fact that the
inosine-containing DNA, expected to be a good deal more
flexible than normal DNA, provides an acceptable substrate for
bleomycin and calicheamicin γ1I cleavage is in accord with these
ideas. Our data vindicate the hypothesis49-53 that both DNA
structure and interaction with guanine are involved in determining
sequence-specific cleavage of DNA by bleomycin and
calicheamicin.

Experimental Section 

Antibiotics, Chemicals, and Biochemicals. Blenoxane (a mixture
of 60% bleomycin A2, 30% bleomycin B2, and 10% other bleomycins)
was obtained from Lunbeck. Stock solutions were prepared in 10 mM
Tris-HCl buffer containing 10 mM NaCl (pH 7.0), divided into 250
μM aliquots, and stored at -20 °C. Calicheamicin from Cyanamid
was dissolved in ethanol to furnish a stock solution at 1.75 μg/mL.
Ammonium persulfate, Tris base, acrylamide, bis-acrylamide, ultrapure
urea, boric acid, tetramethylethylenediamine, and dimethyl sulfate were
from BDH. Formic acid, piperidine, and formamide were from Aldrich.
Photographic requisites were from Kodak. Bromophenol blue and
xylene cyanol were from Serva. Unlabeled deoxynucleoside
triphosphates, including dITP, were purchased from Pharmacia. The
nucleoside triphosphate labeled with [32P] (γ-ATP; 6000 Ci/mmol) was
obtained from NEN Dupont. 2,6-Diaminopurine deoxyribonucleoside
triphosphate was obtained by phosphorylation of the corresponding
nucleoside (Sigma) according to published procedures.54,55 Restriction
endonucleases EcoRI and AvaI (Boehringer), Taq polymerase (Promega),
DNase I (Sigma), and T4 polynucleotide kinase (Pharmacia) were
used according to the supplier's recommended protocol in the activity
buffer provided. The primers, 5'-AATTCCGGTTACCTTTAATC and
5'-TCGGGAACCCCCACCACGGG having a 5'-OH or 5'-NH2 terminal
group, were synthesized at the Laboratory of Molecular Biology,
Medical Research Council, Cambridge. Checks were carried out to
ensure that the primers blocked with a 5'-NH2 group were free from
contaminants and not labeled by the kinase. All other chemicals were
analytical grade reagents, and all solutions were prepared using doubly
deionized, Millipore filtered water.

(47) Gao, X.; Stassinopoulos, A.; Rice, J. S.; Goldberg, I. H. Biochemistry
1995, 34, 40-49.

(48) Sugiyama, H.; Fujiwara, T.; Kawabata, H.; Yoda, N.; Hirayama,
N.; Saito, I. J. Am. Chem. Soc. 1992, 114, 5573-5578.

(49) Uesugi, M.; Sugiura, Y. Biochemistry 1993, 32, 4623-4627.

(50) Walker, S. L.; Andreotti, A. H.; Kahne, D. E. Tetrahedron 1994,
50, 1351-1360.

(51) Mah, S. C.; Price, M. A.; Townsend, C. A.; Tullius, T. D.
Tetrahedron 1994, 50, 1361-1378.

(52) Krishnamurthy, G.; Brenowitz, M. D.; Ellestad, G. A. Biochemistry
1995, 34, 1001-1010.

(53) Nightingale, K. P.; Fox, K. R. Nucleic Acids Res. 1993, 21,
2549-2555.

(54) Ludwig, J. Acta Biochim. Biophys. Acad. Sci. Hung. 1981, 16,
131-133.

(55) Seela, F.; Muth, H.-P.; Röling, A. Helv. Chim. Acta 1991, 74,
554-564.

Figure 5. Cleavage of normal and modified DNA by calicheamicin
γ1I (0.35 μg/mL). The left and right panels refer to the 5'-end labeled
Watson and Crick strands of tyrT(A93) DNA, respectively. Other details
as for Figure 4.

Preparation, Purification, and Labeling of DNA Fragments
Containing Natural and Modified Nucleotides. Plasmid pKMp27
was isolated from E. coli by a standard sodium dodecyl sulfate-sodium
hydroxide lysis procedure and purified by banding in CsCl-ethidium
bromide gradients. Ethidium was removed by several 2-propanol
extractions followed by exhaustive dialysis against Tris-EDTA buffer.
The purified plasmid was then precipitated and resuspended in
appropriate buffer prior to digestion by the restriction enzymes. The
160 base pair tyrT(A93) fragment used as a template was isolated from
the plasmid by digestion with restriction enzymes EcoRI and AvaI. It
is worth mentioning that this template DNA bore a 5'-phosphate due
to the action of EcoRI and thus only the newly synthesized DNA (with
normal or modified nucleotides) can be labeled by the kinase.

(a) Polymerase Chain Reaction (PCR). The protocol used to
incorporate inosine and/or 2,6-diaminopurine residues into DNA is
comparable to those previously used to incorporate 7-deazapurine or
inosine residues with only a few minor modifications.56-59 PCR
reaction mixtures contained 10 ng of tyrT(A93) template, 1 μM each
of the appropriate pair of primers (one with a 5'-OH and one with a
5'-NH2 terminal group) required to allow 5'-phosphorylation of the
desired strand, 250 μM of each dNTP (dTTP, dCTP plus dATP or
dDTP and dGTP or dITP according to the desired DNA), and 5 units
of Taq polymerase in a volume of 50 μL containing 50 mM KCl, 10
mM Tris-HCl, pH 8.3, 0.1% Triton X-100, and 1.5 mM MgCl2. To
prevent unwanted primer-template annealing before the cycles began,
the reactions were heated to 60 °C before adding the Taq polymerase.60
Finally, paraffin oil was added to each reaction to prevent evaporation.
After an initial denaturing step of 3 min at 94 °C, 20 amplification
cycles were performed, with each cycle consisting of the following
segments: 94 °C for 1 min, 37 °C for 2 min, and 72 °C for 10 min.
After the last cycle, the extension segment was continued for an
additional 10 min at 72 °C, followed by a 5-min segment at 55 °C and
a 5-min segment at 37 °C. The purpose of these final segments was
to maximize annealing of full-length product and to minimize annealing
of unused primer to full-length product. The reaction mixtures were
then extracted with chloroform to remove the paraffin oil, and parallel
reactions were pooled. Several extractions with water-saturated
n-butanol were performed to reduce the volume prior to loading the
samples on to a 6% non-denaturing polyacrylamide gel. After
electrophoresis for about 1 h, a thin section of the gel was stained with
ethidium bromide so as to locate the band of DNA under UV light.
The same band of DNA free of ethidium was excised, crushed, and
soaked in elution buffer (500 mM ammonium acetate, 10 mM
magnesium acetate) overnight at 37 °C. This suspension was filtered
through a Millipore 0.22 μm filter and the DNA was precipitated with
ethanol. Following washing with 70% ethanol and vacuum drying of
the precipitate, the purified DNA was resuspended in the kinase buffer.
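
As a rough planning aid, the cycling program described above can be written down as data and its total hold time added up. The sketch below is illustrative only: the variable and function names are hypothetical, and the calculation ignores the time spent ramping between temperatures, which depends on the instrument.

```python
# Thermal cycling program transcribed from the protocol above.
# Each entry is (temperature in deg C, hold time in minutes).
INITIAL_DENATURATION = [(94, 3)]
CYCLE = [(94, 1), (37, 2), (72, 10)]      # denature, anneal, extend
N_CYCLES = 20
FINAL_SEGMENTS = [(72, 10), (55, 5), (37, 5)]


def hold_minutes(segments):
    """Sum the hold times of a list of (temperature, minutes) segments."""
    return sum(minutes for _temperature, minutes in segments)


def program_minutes():
    """Total programmed hold time, ignoring ramping between temperatures."""
    return (hold_minutes(INITIAL_DENATURATION)
            + N_CYCLES * hold_minutes(CYCLE)
            + hold_minutes(FINAL_SEGMENTS))


if __name__ == "__main__":
    total = program_minutes()
    print(f"{total} min of programmed holds (about {total / 60:.1f} h)")
```

The long 10-min extension in each cycle dominates the run time (200 of the 283 min of holds).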

(b) DNA Labeling and Purification. The purified PCR products
were 5'-end labeled with [γ-32P]ATP in the presence of T4
polynucleotide kinase according to a standard procedure for labeling
blunt-ended DNA fragments.61 After completion the labeled DNA was again
purified by 6% polyacrylamide gel electrophoresis and extracted from
the gel as described above. Finally, the labeled DNA was resuspended
in 10 mM Tris-HCl buffer at pH 7.0 containing 10 mM NaCl.

Cleavage of DNA by the Bleomycin-Fe(II) Complex and
Calicheamicin γ1I. In a typical experiment, the freshly prepared
bleomycin-Fe(II) complex (4 μL) was added to 6 μL of 5'-end labeled
DNA (~1 nM) in 10 mM Tris-HCl buffer at pH 7.0 containing 10 mM NaCl.
The equimolar bleomycin-Fe complex consisted of 2 μL of a 25 μM
solution of blenoxane and 2 μL of a 25 μM solution of
Fe(NH4)2(SO4)2·6H2O mixed just prior to the experiment. After
incubation at room temperature for periods varying from a few seconds
to 1 min the cleavage reaction was stopped by freezing. Samples were
lyophilized, resuspended in 50 μL of water, and lyophilized again. The
cleavage products were resuspended in 4 μL of formamide-dye solution
and resolved on a denaturing polyacrylamide gel as described below.

(56) Sayers, E. W.; Waring, M. J. Biochemistry 1993, 32, 9094-9107.

(57) Marchand, C.; Bailly, C.; McLean, M. J.; Moroney, S.; Waring,
M. J. Nucleic Acids Res. 1992, 20, 5601-5606.

(58) Bailly, C.; Marchand, C.; Waring, M. J. J. Am. Chem. Soc. 1993,
115, 3784-3785.

(59) Bailly, C.; Waring, M. J. Nucleic Acids Res. 1995, 23, 885-892.

(60) Bloch, W. Biochemistry 1991, 30, 2735-2747.

(61) Maniatis, T.; Fritsch, E. F.; Sambrook, J. Molecular Cloning: A
Laboratory Manual; Cold Spring Harbor Laboratory Press: Cold Spring
Harbor, NY, 1982.

Figure 6. Susceptibility of normal and I+DAP DNA to cleavage by
calicheamicin γ1I. Details as in Figure 4.

Figure 7. Cleavage of normal and inosine plus DAP-containing DNA
in the presence of increasing concentrations of calicheamicin γ1I. The
reactions were conducted in accordance with the procedure described
in the Experimental Section. The calicheamicin concentration (in
μg/mL) is indicated at the top of each lane.

Two microliters of a 1.75 μg/mL stock solution of calicheamicin
γ1I in ethanol were incubated with 7 μL of DNA (~1 nM) in 10 mM
Tris-HCl buffer at pH 7.0 containing 10 mM NaCl. Solutions of
calicheamicin were prepared by serially diluting the master stock
solution into ethanol. The final ethanol content in the reaction mixture
was 20%. An equivalent volume of ethanol was added to the control
tubes. The DNA-calicheamicin γ1I solutions were equilibrated for 10
min prior to the initiation of the cleavage chemistry. At the chosen
time, 1 μL of 1 mM dithiothreitol was added and the reaction allowed
to proceed for 5 min at room temperature. After precipitation with
ethanol, the DNA sample was resuspended in 4 μL of formamide-dye
solution and the products resolved on a denaturing polyacrylamide
gel. Samples were heated at 90 °C for 4 min and chilled in ice for 4
min prior to electrophoresis.
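
The final concentrations in the two cleavage reactions follow from the stated volumes by simple dilution (C1·V1 = C2·V2). The sketch below works them through, assuming that volumes are additive, that the calicheamicin stock is the only source of ethanol, and that the dithiothreitol solution is aqueous; the helper name is hypothetical.

```python
def final_concentration(stock_conc, stock_volume, total_volume):
    """Concentration after diluting stock_volume of stock to total_volume.

    Uses C1 * V1 = C2 * V2; the result is in the units of stock_conc.
    """
    if total_volume <= 0 or stock_volume < 0 or stock_volume > total_volume:
        raise ValueError("volumes must satisfy 0 <= stock_volume <= total_volume")
    return stock_conc * stock_volume / total_volume


# Bleomycin-Fe(II): 2 uL blenoxane (25 uM) + 2 uL iron salt (25 uM) + 6 uL DNA.
bleomycin_total_ul = 2 + 2 + 6
bleomycin_um = final_concentration(25.0, 2, bleomycin_total_ul)
iron_um = final_concentration(25.0, 2, bleomycin_total_ul)

# Calicheamicin: 2 uL ethanolic stock (1.75 ug/mL) + 7 uL DNA + 1 uL DTT (1 mM).
cali_total_ul = 2 + 7 + 1
cali_ug_per_ml = final_concentration(1.75, 2, cali_total_ul)
ethanol_percent = final_concentration(100.0, 2, cali_total_ul)
dtt_mm = final_concentration(1.0, 1, cali_total_ul)

if __name__ == "__main__":
    print(f"bleomycin {bleomycin_um:.1f} uM, Fe(II) {iron_um:.1f} uM")
    print(f"calicheamicin {cali_ug_per_ml:.2f} ug/mL, "
          f"ethanol {ethanol_percent:.0f}% (v/v), DTT {dtt_mm:.1f} mM")
```

Under these assumptions the calicheamicin reaction works out to 0.35 μg/mL of drug in 20% ethanol, which agrees with the ethanol content quoted in the text and with the concentration given in the Figure 5 legend.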

Electrophoresis, Autoradiography, and Quantitation by Storage
Phosphor Imaging. DNA cleavage products were resolved by
polyacrylamide gel electrophoresis under denaturing conditions (0.3
mm thick, 8% acrylamide containing 8 M urea) capable of resolving
DNA fragments differing in length by one nucleotide. Electrophoresis
was continued until the bromophenol blue marker had run out of the
gel (about 2.5 h at 60 W, 1600 V in TBE buffer, BRL sequencer model
S2). Gels were soaked in 10% acetic acid for 15 min, transferred to
Whatman 3MM paper, dried under vacuum at 80 °C, and examined
by autoradiography using either a phosphorimager or X-ray films (Fuji
R-X) exposed at -70 °C with an intensifying screen usually for 24 h.
For quantitative analysis, a Molecular Dynamics 425E PhosphorImager
was used to collect data from storage screens exposed to the dried gels
overnight at room temperature. Base line corrected scans were analyzed
by integrating all the densities between two selected boundaries using
ImageQuant version 3.3 software. Each resolved band was assigned
to a particular bond within the tyrT(A93) fragment by comparison of
its position relative to sequencing standards generated by treatment of
the DNA with formic acid followed by piperidine-induced cleavage at
the purine residues (G+A track).
ABSTRACT

The proposition that the 2-amino group of guanine 
plays a critical role in determining how antibiotics 
recognise their binding sites in DNA has been tested 
by relocating it, using tyrT DNA derivative molecules 
substituted with inosine plus 2,6-diaminopurine (DAP). 
Irrespective of their mode of interaction with DNA, 
such GC-specific antibiotics as actinomycin, echino- 
mycin, mithramycin and chromomycin find new bind- 
ing sites associated with DAP-containing sequences 
and are excluded from former canonical sites contain- 
ing UC base pairs. The converse is found to be the case 
for a group of normally AT-selective ligands which bind 
in the minor groove of the helix, such as netropsin: 
their preferred sites become shifted to IC-rich clusters. 
Thus the binding sites of all these antibiotics strictly 
follow the placement of the purine 2-amino group, 
which accordingly must serve as both a positive and 
negative effector. The footprinting profile of the 
'threading' intercalator nogalamycin is potentiated in 
DAP plus inosine-substituted DNA but otherwise 
remains much the same as seen with natural DNA. The 
interaction of echinomycin with sites containing the 
TpDAP step in doubly substituted DNA appears much 
stronger than its interaction with CpG-containing sites 
in natural DNA. 

INTRODUCTION 

Gene targeting via DNA-binding drugs remains a cherished goal
of chemotherapy (1-3). If it is to succeed we need a clear 
understanding of the mechanisms whereby small molecules can 
recognise and bind to specific nucleotide sequences in DNA. It 
has long been suspected that the 2-amino group of guanine is the 
prime element which determines sequence recognition via the 
minor groove of the helix where most small molecules bind (4). 
It is the only hydrogen bond donor group exposed there; it 
impedes access to the floor of the groove and it interferes with the 
spine of hydration (5). We have tested the importance of the 
purine 2-amino group by removing it from guanines and 



relocating it on the adenine residues. Footprinting experiments 
performed on the tyrT DNA fragment thus modified reveal that 
the binding (recognition) sites for all small sequence-selective 
ligands tested, including some whose actual specificity is 
unknown, are profoundly changed. 

To accomplish the transfer of the 2-amino group we used the 
polymerase chain reaction to prepare homologous DNA samples 
having guanosine nucleotides replaced by inosines, adenine 
residues replaced by 2,6-diaminopurines (Fig. 1), or both. The 
modified DNA species, as well as normal DNA prepared by the 
same route, were then subjected to DNAase I footprinting in the 
presence of antibiotics known to bind selectively to DNA at 
particular sequences by disparate mechanisms (Fig. 2). As an 
example of a well-characterised intercalator we chose actino- 
mycin D and, for comparison, a bis-intercalator of comparable 
structure and molecular weight, echinomycin: both are cyclic 
depsipeptides endowed with potent antitumour activity (6). As 
examples of ligands binding to the minor groove we looked at two
antiviral agents, netropsin and distamycin, as well as mithra- 
mycin and chromomycin which are also chemotherapeutic drugs 
used in the treatment of cancer (6). Lastly, we included 
nogalamycin as a representative of the clinically important 
anthracycline group of antitumour antibiotics (7). 

MATERIALS AND METHODS 
Antibiotics 

Actinomycin D, nogalamycin, distamycin, chromomycin and
mithramycin were purchased from Sigma. Netropsin was from
Serva and echinomycin was obtained from Parke-Davis (NJ,
USA). Antibiotics were used as supplied without further
purification. The tested drugs showed good aqueous solubility except
echinomycin which is sparingly soluble in water. Echinomycin
was dissolved to a concentration of 100 μM in 10 mM Tris-HCl,
pH 7.0, 10 mM NaCl containing 40% (v/v) methanol. The stock
solution was diluted to working concentrations with appropriate
volumes of 10 mM Tris-HCl, pH 7.0, 10 mM NaCl and methanol
so as to yield a final methanol concentration of 10% (v/v) in the
footprinting reactions. Under these conditions methanol is known
not to affect the nuclease activity (8). Antibiotic concentrations
were determined spectroscopically in 10 mm pathlength quartz
cuvettes through the molar extinction coefficients given in the
literature.

Figure 1. Structures of hydrogen-bonded purine-pyrimidine base pairs.
Broken lines represent hydrogen bonds. I represents inosine; DAP represents
2,6-diaminopurine (2-aminoadenine). The major and minor grooves of the
helix are indicated.
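
Determining a concentration from absorbance in a cuvette of known pathlength is normally done with the Beer-Lambert relation, c = A / (ε·l). The sketch below shows the arithmetic for a 10 mm (1 cm) cuvette. The absorbance and extinction coefficient in the example are hypothetical placeholders, not values for any of the antibiotics used here; the real coefficients must be taken from the literature, as the text states.

```python
def concentration_molar(absorbance, epsilon_per_molar_cm, path_cm=1.0):
    """Beer-Lambert law: c = A / (epsilon * l), returned in mol/L."""
    if epsilon_per_molar_cm <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and pathlength must be positive")
    return absorbance / (epsilon_per_molar_cm * path_cm)


if __name__ == "__main__":
    # Hypothetical reading: A = 0.50 with epsilon = 25,000 per M per cm
    # in a 10 mm (1.0 cm) cuvette.
    c = concentration_molar(0.50, 25_000.0, path_cm=1.0)
    print(f"{c * 1e6:.1f} uM")
```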

Chemicals and biochemicals 

Ammonium persulphate, Tris base, acrylamide, bis-acrylamide,
ultrapure urea, boric acid, tetramethylethylenediamine and
dimethyl sulphate were from BDH. Formic acid, piperidine and
formamide were from Aldrich. Photographic requisites were
from Kodak. Bromophenol blue and xylene cyanol were from
Serva. The labelled nucleoside triphosphate [γ-32P]ATP was
obtained from NEN Dupont. Restriction endonucleases EcoRI
and AvaI (Boehringer), Taq polymerase (Promega), DNase I
(Sigma) and T4 polynucleotide kinase (Pharmacia) were used
according to the supplier's recommended protocol in the activity
buffer provided. The primers, 5'-AATTCCGGTTACCTTTAATC
and 5'-TCGGGAACCCCCACCACGGG having a
5'-OH or 5'-NH2 terminal group, were obtained from The
Laboratory of Molecular Biology, Medical Research Council,
Cambridge. Checks were carried out to ensure that the primers
blocked with a 5'-NH2 group were free from contaminants and
not labelled by the kinase. All other chemicals were analytical
grade reagents, and all solutions were prepared using doubly
deionised, Millipore filtered water.

Figure 2. Structures of DNA-binding antibiotics.

Preparation, purification and labelling of DNA fragments 
containing natural and modified nucleotides 

Plasmid pKMp27 (9) was isolated from E. coli by a standard
sodium dodecyl sulphate-sodium hydroxide lysis procedure and
purified by banding in CsCl-ethidium bromide gradients.
Ethidium was removed by several isopropanol extractions followed
by exhaustive dialysis against Tris-EDTA buffer. The purified
plasmid was then precipitated and resuspended in appropriate
buffer prior to digestion by the restriction enzymes. The 160 base
pair tyrT(A93) fragment used as a template was isolated from the
plasmid by digestion with restriction enzymes EcoRI and AvaI. It
is worth mentioning that this template DNA bore a 5'-phosphate
due to the action of EcoRI and thus only the newly synthesized
DNA (with normal or modified nucleotides) can be labelled by
the kinase.

Polymerase chain reaction (PCR). The protocol used to incorporate
inosine and/or 2,6-diaminopurine residues into DNA is
comparable to those previously used to incorporate 7-deazapurine
or inosine residues with only a few minor modifications
(10,11). PCR reaction mixtures contained 10 ng of tyrT(A93)
template, 1 μM each of the appropriate pair of primers (one with
a 5'-OH and one with a 5'-NH2 terminal group) required to allow
5'-phosphorylation of the desired strand, 250 μM of each
appropriate dNTP (dTTP, dCTP plus dATP or dDTP and dGTP
or dITP according to the desired DNA), and 5 U of Taq
polymerase in a volume of 50 μl containing 50 mM KCl, 10 mM
Tris-HCl, pH 8.3, 0.1% Triton X-100 and 1.5 mM MgCl2. To
prevent unwanted primer-template annealing before the cycles
began, the reactions were heated to 60°C before adding the Taq
polymerase (12). Finally, paraffin oil was added to each reaction
to prevent evaporation. After an initial denaturing step of 3 min
at 94°C, 20 amplification cycles were performed, with each cycle
consisting of the following segments: 94°C for 1 min, 37°C for
2 min and 72°C for 10 min. After the last cycle, the extension
segment was continued for an additional 10 min at 72°C,
followed by a 5 min segment at 55°C and a 5 min segment at
37°C. The purpose of these final segments was to maximize
annealing of full-length product and to minimise annealing of
unused primer to full-length product. The reaction mixtures were
then extracted with chloroform to remove the paraffin oil, and
parallel reactions were pooled. Several extractions with
water-saturated n-butanol were performed to reduce the volume prior to
loading the samples on to a 6% non-denaturing polyacrylamide
gel. After electrophoresis for ~1 h, a thin section of the gel was
stained with ethidium bromide so as to locate the band of DNA
under UV light. The same band of DNA free of ethidium was
excised, crushed and soaked in elution buffer (500 mM ammonium
acetate, 10 mM magnesium acetate) overnight at 37°C. This
suspension was filtered through a Millipore 0.22 μm filter and the
DNA was precipitated with ethanol. Following washing with
70% ethanol and vacuum drying of the precipitate, the purified
DNA was resuspended in the kinase buffer.

DNA labelling and purification. The purified PCR products were
5' end-labelled with [γ-32P]ATP in the presence of T4
polynucleotide kinase according to a standard procedure for labelling
blunt-ended DNA fragments (13). After completion the labelled
DNA was again purified by 6% polyacrylamide gel electrophoresis
and extracted from the gel as described above. Finally, the
labelled DNA was resuspended in 10 mM Tris-HCl, pH 7.0
buffer containing 10 mM NaCl.

DNase I footprinting 

DNase I experiments were performed essentially according to the
original protocol (8). The digestion of the samples (6 μl of the
labelled DNA fragment dissolved in 10 mM Tris buffer pH 7.0
containing 10 mM NaCl) was initiated by the addition of 2 μl of
a DNase I solution whose concentration was adjusted to yield a
final enzyme concentration of ~0.01 U/ml in the reaction mixture.
The extent of digestion was limited to <30% of the starting
material so as to minimize the incidence of multiple cuts in any
strand ('single-hit' kinetic conditions). Optimal enzyme dilutions
were established in preliminary calibration experiments. After 3
min, the digestion was stopped by freeze drying, samples were
lyophilized, washed once with 50 μl of water, lyophilized again
and then resuspended in 4 μl of an 80% formamide solution
containing tracking dyes. Samples were heated at 90°C for 4 min
and chilled in ice for 4 min prior to electrophoresis.
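
The reason for capping digestion at 30% can be illustrated with a simple model. The sketch below assumes, purely for illustration, that the number of DNase I cuts per labelled strand follows a Poisson distribution; this model and the function name are assumptions and are not taken from the protocol. Under that assumption the fraction of strands left uncut fixes the mean number of cuts, and hence the fraction of strands carrying two or more cuts.

```python
import math


def multi_hit_fraction(fraction_digested):
    """Fraction of all strands cut at least twice, assuming Poisson cutting.

    If a fraction f of strands has been cut at least once, the mean number
    of cuts per strand is m = -ln(1 - f), P(0 cuts) = 1 - f and
    P(1 cut) = m * (1 - f).
    """
    if not 0.0 <= fraction_digested < 1.0:
        raise ValueError("fraction_digested must be in [0, 1)")
    uncut = 1.0 - fraction_digested
    mean_cuts = -math.log(uncut)
    single_cut = mean_cuts * uncut
    return 1.0 - uncut - single_cut


if __name__ == "__main__":
    for f in (0.1, 0.3, 0.6):
        multi = multi_hit_fraction(f)
        print(f"{f:.0%} digested: {multi:.1%} of strands cut more than once "
              f"({multi / f:.0%} of the cut strands)")
```

With an end-labelled fragment only the cut nearest the label is detected, so strands cut more than once bias the pattern towards short products; keeping digestion low limits that bias.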

Electrophoresis and autoradiography 

DNA cleavage products were resolved by polyacrylamide gel
electrophoresis under denaturing conditions (0.3 mm thick, 8%
acrylamide containing 8 M urea) capable of resolving DNA
fragments differing in length by one nucleotide. Electrophoresis
was continued until the bromophenol blue marker had run out of
the gel (~2.5 h at 60 W, 1600 V in TBE buffer, BRL sequencer
model S2). Gels were soaked in 10% acetic acid for 15 min,
transferred to Whatman 3MM paper, dried under vacuum at 80°C
and subjected to autoradiography at -70°C with an intensifying
screen. Exposure times of the X-ray films (Fuji R-X) were
adjusted according to the number of counts per lane loaded on
each individual gel (usually 24 h).

Quantitation by storage phosphorimaging 

A Molecular Dynamics 425E PhosphorImager was used to
collect data from storage screens exposed to the dried gels
overnight at room temperature (14). Base line-corrected scans
were analyzed by integrating all the densities between two
selected boundaries using ImageQuant version 3.3 software.
Each resolved band was assigned to a particular bond within the
tyrT(A93) fragment by comparison of its position relative to
sequencing standards generated by treatment of the DNA with
formic acid followed by piperidine-induced cleavage at the
purine residues (G+A track).

RESULTS 

Strong footprints were produced by each ligand on the normal and
doubly substituted DNAs (Fig. 3). Echinomycin is a bis-intercalator
(8,15), and actinomycin a mono-intercalator (16,17), which
normally bind to CpG and GpC steps respectively: the footprints
for both antibiotics are radically altered by the nucleotide
substitution. The same is true for netropsin (18) and distamycin
(19,20) which are AT-specific minor groove-binders, as well as
for mithramycin and chromomycin (21) which bind in dimeric
form to GC-rich sequences within the minor groove (22-24).
Even the binding of nogalamycin (25-27), a 'threading'
intercalator which interacts with a puzzling variety of sites in natural
DNA (28), is changed.

The canonical and newly created binding sites in the two types
of DNA can be identified in Figure 4 where the near-inversion of
the footprinting pattern for most ligands is clearly evident,
especially in the central portion of the sequence extending from
position 60 to 100 of the tyrT fragment. Irrespective of their mode
of binding, the normally GC-selective antibiotics are displaced
from their clustered sites lying between positions 70 and 80 in
natural DNA, to pick up new sites in the DAP·T-rich sequences
either side (panels A, B and D). By contrast, the normally
AT-selective netropsin is displaced from its canonical sites in
those flanking sequences (positions 60-69 and 82-90) to bind
decisively to the intervening IC-rich cluster created in the doubly
substituted DNA (panel C). Essentially similar results were
obtained with distamycin and with the synthetic minor
groove-binding drugs DAPI and berenil (29,30).

Figure 3. Autoradiograph of a high-resolution denaturing gel showing DNAase
I footprinting of antibiotics on the Crick (sense) strand of normal and inosine
plus DAP tyrT(A93) DNA. Each pair of lanes corresponds to digestion of
normal (left) and I + DAP DNA (right). The products of DNAase I digestion
were resolved on an 8% polyacrylamide gel containing 8 M urea. Their
identities were assigned by reference to the Maxam-Gilbert guanine markers
(lane G), taking into account the difference in mobility of the fragments due to
the presence or absence of a 3'-phosphate group. Lanes marked control refer to
the DNAase I digests of the normal and doubly substituted DNA in the absence
of antibiotic. The other lanes contained 20 μM actinomycin, 10 μM netropsin,
10 μM distamycin, 10 μM mithramycin, 10 μM chromomycin or 5 μM
nogalamycin. With echinomycin, 20 μM was used to footprint normal DNA
and 5 μM for inosine plus DAP-DNA. The scale on the left corresponds to the
standard numbering of the tyrT(A93) sequence (9) as represented in Figure 5.

The case of echinomycin is examined in greater detail in Figure
5 which shows the complete footprinting pattern on both strands
of the tyrT(A93) DNA over the whole length of sequence
accessible to analysis. As previously reported with this restriction
fragment (8) the canonical sites in natural tyrT DNA are squarely
located around the CpG steps, marked by open rectangles in the
illustration. In the doubly inosine plus DAP-substituted DNA
every region protected from DNAase I attack occurs at a TpDAP
step and all such steps (shown as black rectangles) constitute part
of a ligand binding site. The radically changed pattern of binding
sites for echinomycin, reflecting its faithful recognition of the
displaced 2-amino groups, is emphasised by an examination of
the antibiotic concentration-dependence of the footprinting
profile (Fig. 6). Over at least two orders of magnitude the
footprinting pattern remains firmly locked to the sites
surrounding the TpDAP steps at positions 61, 69, 89, 111, 127 and 137 on
the Crick strand of the tyrT fragment. Only at the TpI step at
position 82 can any departure from strict adherence to this rule be
discerned: at very low echinomycin concentrations (<1 μM) the
band corresponding to cutting at this step is clearly enhanced but
by 10-20 μM the enhancement is lost, suggestive of incipient
protection (footprinting) at this site. This looks like a classic
instance of secondary binding, characterised by diminished
specificity, to a site for which the antibiotic has lower affinity (and
which lacks one critical 2-amino recognition element) such as is
commonly seen with many DNA-binding drugs (6).

Figure 4. Differential cleavage plots comparing the DNAase I-mediated
cleavage of the Watson (antisense) strand of normal and I + DAP tyrT(A93)
DNA in the presence of actinomycin (A), echinomycin (B), netropsin (C),
chromomycin (D) and nogalamycin (E). Data for distamycin and mithramycin
are not included; they were much the same as those shown for netropsin and
chromomycin respectively. The plots drawn as continuous lines refer to the
modified tyrT(A93) DNA fragment containing inosine and DAP residues. The
plots indicated by dashed lines refer to normal tyrT(A93) DNA. Antibiotic
concentrations used to generate these plots were identical to those specified in
Figure 3. Positive and negative values correspond, respectively, to enhanced or
diminished DNAase I cutting at each internucleotide bond. The values plotted
compare the measured probabilities of cleavage expressed in logarithmic units
and are smoothed by taking a three-bond running average.

Figure 5. Differential cleavage plots comparing the susceptibility of tyrT(A93) DNA to cutting by DNAase I in the presence of 10 μM echinomycin. The dashed and
continuous curves refer to natural and inosine plus DAP-substituted DNA respectively, as for Figure 4. The upper panel shows differential cleavage of the Watson
strand, the lower panel that of the complementary Crick strand. The ordinate scales for the two strands are inverted, so that deviation of points towards the lettered
sequence (negative values) corresponds to a ligand-protected site and deviation away (positive values) represents enhanced cleavage. The filled rectangles near the
indicated dinucleotide steps show the positions of the TpA steps and the open rectangles show the positions of the CpG steps. Other details as for Figure 4.
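
The differential cleavage values plotted in Figures 4 and 5 are described as differences in cleavage probability, in logarithmic units, smoothed over three bonds. The sketch below shows one way such a plot could be computed from integrated band intensities. It is not the authors' code. The natural logarithm, the normalisation of each lane to its own total and the shortened averaging window at the two ends are all assumptions, and the band intensities in the example are hypothetical.

```python
import math


def cleavage_probabilities(band_intensities):
    """Convert integrated band intensities to fractional cleavage per bond."""
    total = sum(band_intensities)
    if total <= 0 or any(x <= 0 for x in band_intensities):
        raise ValueError("band intensities must all be positive")
    return [x / total for x in band_intensities]


def running_average(values, window=3):
    """Centred running average; the window is truncated at both ends."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - half):min(len(values), i + half + 1)]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def differential_cleavage(drug_bands, control_bands):
    """ln(p_drug) - ln(p_control) at each bond, three-bond smoothed.

    Negative values indicate protection, positive values enhanced cutting.
    """
    if len(drug_bands) != len(control_bands):
        raise ValueError("lanes must cover the same bonds")
    p_drug = cleavage_probabilities(drug_bands)
    p_control = cleavage_probabilities(control_bands)
    raw = [math.log(d) - math.log(c) for d, c in zip(p_drug, p_control)]
    return running_average(raw, window=3)


if __name__ == "__main__":
    control = [10.0, 10.0, 10.0, 10.0, 10.0]   # hypothetical intensities
    with_drug = [10.0, 10.0, 1.0, 10.0, 10.0]  # bond 3 protected
    print([round(v, 2) for v in differential_cleavage(with_drug, control)])
```

Because each lane is normalised to its own total in this sketch, protection at one bond necessarily shows up as a slight apparent enhancement elsewhere.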

A full analysis of the behaviour of actinomycin D has yielded 
equivalent results (31): the binding sites in doubly inosine plus 
DAP-substituted DNA lie squarely over the DAPpT steps and al! 
such steps form part of a binding site for the antibiotic, whereas 
the GpC step is the essential component of canonical recognition 
sequences in natural DNA. Not so for nogalamycin, however: the 
sequences to which this antibiotic binds in the two types of DNA 
are not greatly different though the intensity of the footprinting 
profile is markedly enhanced at specific sites particularly on the 
lower (Crick) strand (Fig. 7). A previous exhaustive footprinting 
study with several DNA restriction fragments (28) concluded that 
most nogalamycin binding sites are located near regions of 
alternating purine-pyrimidine sequence, most commonly asso- 
ciated with the dinucleotide steps TpG (CpA) and GpT (ApC), 
suggesting that the preferred antibiotic binding sites may contain 



all four nucleotides and/or that peculiarities of the dynamics of 
DNA conformation at alternating sequences may be critical for 
nogalamycin binding. Given its unusual mode of 'threading' 
intercalation which probably involves local disruption of base- 
pairing and results in conspicuously slow association and 
dissociation kinetics (32,33) the results in Figure 7 are perhaps 
unsurprising. It is likely that the binding of this ligand is 
dominated by the structural and dynamic features of DNA which 
may well be broadly similar for natural and doubly-substituted 
DNA molecules. For example, the alternation of purines and 
pyrimidines at particular sites remains the same in both types of 
DNA. At all events, there is rather poor correspondence between 
the sites protected from nuclease cleavage by nogalamycin and 
the TpG (CpA) and GpT (ApC) steps in natural DNA or Tpl 
(CpDAP) and IpT (DAPpC) steps in substituted DNA, so the 
process of site recognition by nogalamycin appears to proceed to 
a large extent independently of the placement of the purine 
2-amino group. It is tempting to speculate that the preferred 
nogalamycin binding sites are determined chiefly by substituents 
lying in the major groove of the DNA helix, which is where one 
of the bulky sugar substituents of the antibiotic must come to lie 
(27), and consequently the best sites are to be found at the same 
places in both types of DNA in Figure 7 although the ease of 
binding to such sites reflected in the binding kinetics may be

Figure 6. Differential cleavage plots showing the concentration-dependence of footprinting by 0.2-20 µM echinomycin on the Crick (sense) strand of tyrT(A93) DNA
containing inosine plus DAP residues. Positive and negative values represent, respectively, enhanced or diminished DNAase I cutting efficiency at each internucleotide
bond. Note that the sequence shown corresponds to that of natural DNA, though here the A residues are replaced by DAP and the G residues by inosine. Black boxes
show the locations of the TpDAP steps. A white box is drawn over the TpI step at position 82.



much affected, witness the exaggerated excursions of the 
differential cleavage plot for the substituted DNA.

The findings summarised in Figures 3 and 4 establish that the 
purine 2-amino group characteristic of guanine nucleotides serves 
as both a positive and negative effector to determine ligand binding 
sites, and thereby dominates the sequence-recognition process. 
This interpretation is amplified by parallel observations on the 
interaction of the same antibiotics with singly inosine (11) or 
DAP-substituted (34) DNA molecules. The positive role of the 
2-amino group as a necessary component of binding sites for 
GC-selective antibiotics is confirmed by the failure of echinomy- 
cin, actinomycin, chromomycin and mithramycin to footprint on 
inosine-containing DNA, plus their redistribution on to new sites 
in DAP-substituted DNA. Conversely, the negative signal given by 
the 2-amino group as a marker of sites unavailable for binding 
AT-selective antibiotics is rendered obvious by the failure of 
netropsin and distamycin to footprint on DAP-containing DNA, 
whereas they seem to bind all over the inosine-substituted polymer. 

Quantitative analysis of ligand-site interactions backs up these
conclusions. The full concentration-dependence profile (35) for
each antibiotic has been determined at all binding sites and a few
examples are illustrated in Figure 8. With echinomycin, the
footprinting on natural DNA at the canonical CpG steps and the
corresponding enhancement of DNAase I cleavage at TpA steps
occur with half-maximal effect at concentrations (C50) ~2-5 µM.
By contrast, the footprinting at TpDAP steps in inosine plus
DAP-substituted DNA, and enhancement of cutting at CpI steps,
take place at much lower concentrations: C50 = 1 µM or below.
The same is true for DNA substituted only with DAP. We must
conclude that the new DAP-containing binding sites are superior
to the canonical CpG-containing sites. Under the conditions of
these footprinting experiments a large fraction of the added ligand



is likely to remain free, such that C50 values will approximate to
thermodynamic dissociation constants for binding to individual
sites (35). On this basis we estimate that the binding constant for
echinomycin at several DAP-containing sites must be enhanced
by a factor of ten or more. With netropsin and distamycin no such
potentiation of effect occurs: C50 values for footprinting at the
newly created IC-containing binding sites, and for enhanced
nuclease cleavage at DAP·T-rich clusters, are generally somewhat
higher than those required to produce equivalent effects at
the canonical sites in normal DNA (Fig. 8). With actinomycin and
the other antibiotics the apparent affinity for new binding sites in
doubly-substituted DNA is generally little different from that
measured for the natural tyrT fragment.
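
The C50 values quoted above are read from footprinting plots of relative band intensity against ligand concentration. As a rough illustration only, the sketch below estimates a C50 by interpolating on a logarithmic concentration axis. The function name `estimate_c50`, the use of the lowest observed intensity as the plateau, and every number in the example are assumptions made for this sketch; they are not data or procedures taken from the study.

```python
import math


def estimate_c50(concs_uM, rel_intensity):
    """Estimate the ligand concentration giving a half-maximal footprint.

    concs_uM      -- ligand concentrations in ascending order (micromolar)
    rel_intensity -- relative band intensity R = Ic/Iu at each concentration
    Assumes R falls from 1 (no ligand) towards the lowest observed value.
    """
    r_min = min(rel_intensity)
    r_half = (1.0 + r_min) / 2.0  # midpoint between unprotected and plateau
    points = list(zip(concs_uM, rel_intensity))
    for (c0, r0), (c1, r1) in zip(points, points[1:]):
        if r0 >= r_half >= r1:
            if r0 == r1:
                return c0
            # Linear interpolation in log10(concentration).
            frac = (r0 - r_half) / (r0 - r1)
            log_c = math.log10(c0) + frac * (math.log10(c1) - math.log10(c0))
            return 10 ** log_c
    raise ValueError("half-maximal intensity is not bracketed by the data")


# Hypothetical titration of a single bond (illustrative values, not measurements).
concs = [0.2, 0.5, 1.0, 2.0, 5.0, 10.0, 20.0]
ratios = [0.93, 0.84, 0.73, 0.60, 0.43, 0.33, 0.27]
c50 = estimate_c50(concs, ratios)
```

Because the interpolation works on the logarithmic axis used in the footprinting plots, it is only as good as the spacing of the tested concentrations; a fitted binding isotherm would be the more rigorous choice where enough points are available.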

DISCUSSION 

What is the mechanism by which the purine 2-amino group 
dictates where sequence-selective small molecules bind to DNA? 
It could be mediated via direct contact between the interacting 
species involving the formation of hydrogen bonds together with 
elements of steric complementarity, the so-called 'digital'
readout (36). Alternatively it could occur by a kind of 'analogue'
readout in which the favoured binding sites are recognised by
virtue of some conformational property such as groove width
which is only indirectly influenced by the location of the critical
2-amino group. We have evidence from comparing the reactivity
of inosine and DAP-substituted DNAs towards structure-sensitive
probes that groove width is a relevant parameter which can
be strongly affected by repositioning the purine 2-amino group
(37). Deformability of the helix is another property which is likely
to be affected. Either or both of these features of DNA helical
structure could conspire to determine what nucleotide sequences,

Figure 7. Differential cleavage plots comparing the susceptibility of tyrT(A93) DNA to cutting by DNAase I in the presence of 5 µM nogalamycin. Details as for
Figures 4 and 5. Black rectangles indicate the TpG (CpA) steps, open rectangles the GpT (ApC) steps.

Figure 8. Footprinting plots (35) for selected bonds in the normal tyrT(A93) fragment (open circles) and its inosine plus DAP-substituted homologue (filled circles).
The relative band intensity R corresponds to the ratio Ic/Iu where Ic is the intensity of the band at a ligand concentration c and Iu is the intensity of the same band in
the absence of antibiotic. Ligand concentrations are plotted on a logarithmic abscissa and each plot is linked to the appropriate scale on the left or right ordinate.



in a particular context, might constitute an acceptable ligand 
binding site. Indeed, for a given ligand the part played by any 
single factor ultimately under the control of the purine 2-amino 
group could vary at different potential binding sites. At all events,
based on our present findings we are confident that efforts to
design DNA sequence-specific ligands using various motifs, all
crucially dependent upon exploiting principles of molecular
recognition, can be placed on a firmer footing.

abstract: Nucleotide analogue interference mapping (NAIM) is a general biochemical method that rapidly
identifies the chemical groups important for RNA function. In principle, NAIM can be extended to any
nucleotide that can be incorporated into an in vitro transcript by an RNA polymerase. Here we report the
synthesis of 5'-O-(1-thio)-N2-methylguanosine triphosphate (m2GαS) and its incorporation into two reverse
splicing forms of the Tetrahymena group I intron using a mutant form of T7 RNA polymerase. This
analogue replaces one proton of the N2 exocyclic amine with a methyl group, but is as stable as guanosine
(G) for secondary structure formation. We have identified three sites of m2GαS interference within the
Tetrahymena intron: G22, G212, and G303. All three of these guanosine residues are known to utilize
their exocyclic amino groups to participate in tertiary hydrogen bonds within the ribozyme structure.
Unlike the interference pattern with the phosphorothioate of inosine (IαS, an analogue that deletes the N2
amine of G), m2GαS substitution did not cause interference at positions attributable to secondary structural
stability effects. Given that the RNA minor groove is likely to be widely used for helix packing, m2GαS
provides an especially valuable reagent to identify RNA minor groove tertiary contacts in less
well-characterized RNAs.



The wide and shallow minor groove of the RNA A-form 
double helix appears to be commonly used for helix packing 
interactions that are necessary for the formation of RNA 
tertiary structure (1). Structural motifs involving the RNA
minor groove include the "ribose zipper" and the "wobble 
receptor", both of which employ nucleotide functional groups 
unique to the minor groove surface (2, 3). Unfortunately, 
the minor groove is difficult to analyze biochemically 
because the chemical reagents typically used in footprinting 
and interference studies are not informative for the minor 
groove functional groups. One reagent that is reactive with 
these groups is kethoxal, which forms a covalent bridge 
between the N2 exocyclic amine and N1 groups of G when
both functional groups are accessible (4, 5). While this 
reagent is valuable for identifying unpaired Gs within a 
sequence, kethoxal cannot be used to identify tertiary 
interactions within helical segments of the RNA. Thus, 
despite the importance of the minor groove in RNA helix 
packing, there are significant deficiencies in our ability to 
explore the minor groove face of the helix with the probing 
reagents that are currently available. 

Nucleotide analogue interference mapping (NAIM) 1 is an
efficient method to define the chemical basis of RNA
function (6-9). In principle, the method is generalizable to
a wide variety of nucleotide derivatives. In this approach,
the nucleotide analogue is chemically tagged with a phosphorothioate
linkage and randomly incorporated into the
RNA transcript. The sites of phosphorothioate incorporation
are detected by cleavage of the linkage with iodine (10). This
makes it possible to simultaneously, yet individually, quantitate
every position where the nucleotide was incorporated
into the RNA with an interference assay that is as simple as
RNA sequencing. This approach has been used with 2'-deoxy
and 2'-methoxy analogues to identify the 2'-OH groups
in tRNA and RNase P that are essential for binding (6, 7,
11). It has also been employed with inosine and a series of
eight adenosine derivatives including 7-deazaadenosine,
purine riboside, diaminopurine riboside, and N6-methyladenosine
to explore group I intron catalysis (3, 8, 9).

The Tetrahymena ribozyme provides an ideal system to 
further develop NAIM methodology (Figure 1). The intron 
catalyzes two consecutive transesterification reactions in the 
course of RNA self-splicing (12). It can also catalyze the
reverse of these two reactions, resulting in ligation of the
exon back onto the intron (Figure 2A,B) (13-16). By using
a radiolabeled oligonucleotide analogue of the exons, the
active ribozymes in the population become site-specifically
labeled upon exon ligation, which provides a remarkably
simple assay to probe for sites of interference (Figure 2C) 
(8). Furthermore, the Tetrahymena ribozyme ranks as 
possibly the best characterized large RNA, because a crystal 
structure of the P4-P6 domain of the intron (160 of the 414 



1 Abbreviations: m2G, N2-methylguanosine; m2GαS, 5'-O-(1-thio)-N2-methylguanosine
monophosphate; m2GTPαS, 5'-O-(1-thio)-N2-methylguanosine
triphosphate; GαS, 5'-O-(1-thio)guanosine; GTPαS, 5'-O-(1-thio)guanosine
triphosphate; IαS, 5'-O-(1-thio)inosine monophosphate;
NAIM, nucleotide analogue interference mapping; dT,
thymidine; rT, 5-methyluridine; dT(-1)S, CCCUC(dT)AAAAA;
dT(-1)P, CCCUC(dT); rT(-1)P, CCCUC(rT); SDS, sodium dodecyl
sulfate; IGS, internal guide sequence.

Figure 1: Secondary structure of the L-21 G414 form of the
Tetrahymena group I intron. Nucleotides discussed in the text are
shown, as are the names of the helical and single-stranded regions
of the RNA. Other nucleotides are depicted as heavy lines.
Connectivity within the ribozyme sequence is shown as thin lines,
and the tertiary hydrogen bonds formed by the three Gs that display
m2GαS interference are shown as dashed lines. This ribozyme binds
the oligonucleotide dT(-1)S, CCCUC(dT)AAAAA, and transfers
the AAAAA onto G414 at the 3'-end of the intron in a reaction
analogous to the reverse of the second step of splicing (15, 16).

nucleotides) is available, and three-dimensional models of 
the rest of the intron have been proposed based upon 
phylogenetic and other biochemical experiments (2, 17-20).
Thus, interference in this ribozyme system provides a basis 
set to calibrate NAIM methodology prior to its application 
on less well-characterized RNAs. Yet, even within this 
RNA, the interference results can help to refine and improve 
our understanding of group I intron structure and catalysis. 

Previous mapping experiments of the Tetrahymena group 
I intron with the phosphorothioate of inosine (IαS), a G
analogue that replaces the N2 exocyclic amine with a proton
(Figure 3), revealed several sites that interfered with ribozyme
activity (8). Interference at many of these sites is
unlikely to indicate direct participation of the amine in
tertiary hydrogen bonding, but rather reflects a loss of duplex
stability. In hopes of developing a better analogue to analyze
the role of the minor groove exocyclic amine in tertiary
structure formation, we synthesized 5'-O-(1-thio)-N2-methylguanosine
triphosphate (m2GTPαS) and utilized it in
NAIM. Instead of deleting the amine, this analogue replaces
one of the amino protons with a methyl group (Figure 3).
We find that the sites of interference throughout the
Tetrahymena group I intron are exclusively at positions
where the exocyclic amine of G is known to participate in
long-range tertiary hydrogen bonds. Thus, m2GαS provides
a means to identify tertiary interactions within the RNA 
minor groove, a region that is relatively uninformative using 
previously available biochemical methodologies. 

METHODS 

Synthesis of m2GTPαS. O6-[2-(Nitrophenyl)ethyl]-N2-methylguanosine
was prepared as previously described (21-23).
N2-Methylguanosine was synthesized by treating
O6-[2-(nitrophenyl)ethyl]-N2-methylguanosine (1.0 g, 2.3 mmol)
with 1,8-diazabicyclo[5.4.0]undec-7-ene (5 mL) at room

Figure 2: (A) Scheme for the reaction of the L-21 G414 ribozyme
with oligonucleotide substrate. This reaction is analogous to the
reverse of the second step of splicing (75, 15, 16). The ribozyme
binds the substrate to form the P1 helix, which docks into the active
site. The terminal guanosine (G414) nucleophilically attacks the
substrate and transfers the 3'-end of the oligonucleotide onto the
3'-end of the intron. The equilibrium constant for the chemical step
of this reversible reaction is approximately 1 (16). This reaction
selectively 3'-end-labels the active ribozymes in the population
when a 3'-end-labeled substrate (*) is used. (B) Scheme for L+1
ScaI ribozyme reaction analogous to the reverse of the first step of
splicing (14). The ribozyme binds the 5'-exon oligonucleotide
analogue, where the 3'-OH of the exon attacks the 3'-phosphate of
G1, releasing G1 as a free nucleotide and adding the 5'-exon onto
the ribozyme. This reaction selectively 5'-end-labels the active
ribozymes in the population when a 5'-end-labeled (*) oligonucleotide
is used. (C) Scheme for the identification of the chemical
groups important for RNA activity by NAIM (8). The phosphorothioate-tagged
nucleotide analogue (indicated as SaS) is randomly
incorporated into the transcript in place of G. If m2GαS does not
interfere with function at a particular position (left side), then
ribozymes with the analogue at that site perform the ligation reaction
and become radiolabeled. If m2GαS disrupts activity (right side),
then the subset of ribozymes that have m2GαS incorporated at the
susceptible site do not perform the ligation reaction and are not
radiolabeled. Cleavage of the phosphorothioate linkages by treatment
with iodine and resolution of the cleavage products by PAGE
produce a sequencing ladder with gaps that correspond to sites
intolerant of m2GαS substitution. GαS serves as a control to ensure
that loss of activity is not due to the phosphorothioate group.
Unreacted RNA is also 5'-end-labeled to ensure that the gap in the
sequencing ladder is not due to lack of m2GαS incorporation at a
given site (not shown).

temperature overnight. Water (50 mL) was added to the
reaction, and the aqueous phase was extracted with CH2Cl2

Figure 3: Parental G nucleotide and the two nucleotide analogues
used for NAIM in this work. Each nucleotide is shown as the
monophosphate derivative, the form in which it is incorporated
during transcription. N2-Methylguanosine can adopt either the
s-trans (shown) or the s-cis rotamer.

(3 x 100 mL) and with ether (3 x 100 mL). The water
was removed by rotary evaporation and the reaction dried
overnight in vacuo. Authentic N2-methylguanosine (24) was
obtained as an off-white solid (340 mg, 50% yield) by
recrystallization of the residue with hot methanol.
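
As a quick arithmetic check on the quantities reported for this step, the sketch below recomputes the percent yield from the stated masses. The molecular formula C11H15N5O5 for N2-methylguanosine and the rounded atomic masses are supplied here as assumptions (they are not given in the text), and the helper names are hypothetical.

```python
# Standard atomic masses, rounded (g/mol).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}

# Assumed composition of N2-methylguanosine (guanosine plus one CH2 unit).
N2_METHYLGUANOSINE = {"C": 11, "H": 15, "N": 5, "O": 5}


def molar_mass(composition):
    """Sum atomic masses over an element -> count mapping (g/mol)."""
    return sum(ATOMIC_MASS[el] * n for el, n in composition.items())


def percent_yield(product_mg, product_molar_mass, limiting_mmol):
    """Percent yield from product mass (mg) and limiting reagent (mmol)."""
    product_mmol = product_mg / product_molar_mass  # mg / (mg per mmol)
    return 100.0 * product_mmol / limiting_mmol


mw = molar_mass(N2_METHYLGUANOSINE)
# 340 mg of product from 2.3 mmol of protected nucleoside, as stated above.
yield_pct = percent_yield(340.0, mw, 2.3)
```

With these inputs the computed value lands close to the 50% yield quoted in the text, which suggests the reported mass and molar amount are mutually consistent.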

Synthesis of 5'-O-(1-thio)-N2-methylguanosine triphosphate
followed the general procedure outlined by Arabshahi
and Frey (25). N2-Methylguanosine (50 mg, 0.17 mmol) was
dried under vacuum at 110 °C for 16 h and dissolved in
triethyl phosphate (2 mL). Trioctylamine (83 µL, 0.19 mmol,
1.1 equiv) and thiophosphoryl chloride (20 µL, 0.19 mmol,
1.3 equiv) were added to the reaction and stirred under argon
at room temperature for 30 min to form
5'-O-(1-thio-1,1-dichloro)phosphoryl-N2-methylguanosine. The reaction was
about 40% complete based upon thin-layer chromatography
(TLC) on cellulose plates using 0.5 M LiCl (aq) as the
solvent system. This was converted directly to the triphosphate
by addition of tributylammonium pyrophosphate (105
mg, 0.34 mmol, 2.0 equiv) in triethyl phosphate (3 mL) and
stirring at room temperature for an additional 30 min.
Formation of the N2-methylguanosine 5'-O-(1-thio)-cyclo-triphosphate
was monitored by silica TLC using 6:3:1
1-propanol, ammonium hydroxide, water as the solvent
system. In this system, the triphosphate had an Rf of 0.2
compared to 0.6 for the monophosphate and 0.8 for the free
nucleoside. The triphosphate was precipitated by addition
of excess triethylamine (2.5 mL), centrifuged, and decanted,
and the residue was dissolved in aqueous triethylammonium
bicarbonate (TEAB) (50 mM, pH 7.5, 10 mL). The crude
product was left at room temperature overnight to achieve
hydrolytic ring opening of the cyclic triphosphate. Purification
by DEAE-A25 Sephadex chromatography using a linear
gradient of 0.05-0.8 M TEAB afforded 5'-O-(1-thio)-N2-methylguanosine
triphosphate (m2GTPαS) as a diastereomeric
mixture in 22% yield. The triphosphate eluted at
approximately 0.6 M TEAB. 31P NMR (H2O): 43.37 (m),
-6.87 (d), -23.00 (t); λmax 254 nm; ε254 13 000.

Transcriptional Incorporation of m2GαS. Plasmid templates
for RNA transcription were prepared by ionic exchange
chromatography (Qiagen), digested with the appropriate
restriction enzyme, phenol extracted, and ethanol
precipitated. pUCL-21G414 was cut with EarI, and pUCL+1
was cut with ScaI (8). m2GαS was randomly incorporated
into the L-21 G414 or L+1 ScaI forms of the intron by in
vitro transcription using the wild-type or Y639F mutant form
of T7 RNA polymerase (26). RNAs were transcribed in 40
mM Tris-HCl, pH 7.5, 4 mM spermidine, 10 mM DTT, 15
mM MgCl2, 0.05% Triton X-100, 0.05 µg/µL DNA template,
and 1 mM CTP, UTP, and ATP. Using the pUCL-21G414
plasmid, various concentration ratios of m2GTPαS (0.1, 0.5,
1.0, and 2.0 mM) and GTP (0.1, 0.5, and 1.0 mM) were
tested to identify a transcription condition that gave approximately
5% m2GαS incorporation as defined by comparison
to a transcript made with 50 µM GTPαS (Sp
diastereomer only) and 1 mM GTP (27). Following this
determination, the L+1 ScaI RNA was transcribed using 1.0
mM m2GTPαS, 0.5 mM GTP, and the Y639F polymerase.
All the RNAs were purified by PAGE (6% denaturing),
eluted into 10 mM Tris, pH 7.5, 0.1 mM EDTA (TE)
overnight at 4 °C, precipitated with NaCl and ethanol (-80
°C for 2 h), resuspended in TE, and stored at -20 °C.
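
To give a feel for what roughly 5% incorporation means per transcript, the sketch below treats each G position as an independent trial with the same substitution probability. That independence, the uniform probability, and the function name are simplifying assumptions made here for illustration; the count of 107 G positions is the figure given in the Results.

```python
from math import comb


def substitution_probability(n_sites, p_incorp, k):
    """Binomial probability that exactly k of n_sites carry the analogue."""
    return comb(n_sites, k) * p_incorp**k * (1 - p_incorp) ** (n_sites - k)


N_G_SITES = 107   # G positions in the ribozyme, as reported in the Results
P_INCORP = 0.05   # approximate incorporation level per G position

# Mean number of analogue substitutions expected per transcript.
expected_per_transcript = N_G_SITES * P_INCORP
# Fraction of transcripts expected to carry no analogue at all.
p_none = substitution_probability(N_G_SITES, P_INCORP, 0)
# Fraction expected to carry exactly one substitution.
p_single = substitution_probability(N_G_SITES, P_INCORP, 1)
```

Under these assumptions most transcripts carry several substitutions, so the incorporation level is a compromise between sampling every site and keeping the analogue burden per molecule low.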

5'- and 3'-End-Labeling of RNA. Three oligonucleotides
were utilized in the interference mapping experiments.
dT(-1)S, CCCUC(dT)AAAAA (20 pmol), was radiolabeled
at the 3'-end with [α-32P]cordycepin by yeast poly(A)
polymerase (Amersham) (28) and used as a substrate for the
3'-exon ligation experiments. dT(-1)P, CCCUC(dT), and
rT(-1)P, CCCUC(rT), were each radiolabeled at the 5'-end
with [γ-32P]ATP by T4 polynucleotide kinase and used as
substrates for the 5'-exon ligation experiments. Both RNAs 
were purified by PAGE (10% nondenaturing), eluted in TE, 
and used in the ligation assays without further treatment. 

L-21 G414 RNAs (2.5 pmol) were treated with calf 
intestinal alkaline phosphatase (2 units, 30 min, 37 °C) to 
remove the 5'-phosphate and heated to 85 °C for 15 min to 
inactivate the phosphatase. The RNAs were 5'-end-labeled 
using T4 polynucleotide kinase (5 units) and [γ-32P]ATP at
37 °C for 30 min. The radiolabeled RNAs were purified by
PAGE (6% denaturing) and eluted into 0.1% SDS in TE
overnight. The SDS was removed by extraction with 1
volume of phenol/chloroform (1:1), and the RNAs were
precipitated with NaCl and ethanol, centrifuged, decanted,
and resuspended in 50 µL of TE. The specific activities of
the 5'-end-labeled RNAs (cpm/µL) were normalized by
scintillation counting. The RNAs were cleaved by the
addition of 0.1 volume of 100 mM iodine in ethanol and
heated to 90 °C for 1 min, and the cleavage products were 
resolved by PAGE on a 6% or a 5% denaturing gel. Several 
loadings of the same reaction samples were electrophoresed 
for variable amounts of time (from 1 to 5 h) to maximally 
resolve each region of the sequence. 

Despite several attempts, we were unable to obtain clean
sequence information by 5'-end-labeling the L+1 ScaI RNA
due to some heterogeneity at the 5'-end of the L+1 ScaI
RNA. To obtain information for the positions near the 5'-end
of the transcript that is comparable to that derived from
the 5'-end-labeled control, the L+1 ScaI RNA was incubated
with a highly reactive substrate, rT(-1)P, under permissive
reaction conditions (10 mM MgCl2, 50 °C, 50 mM HEPES,
pH 7.0) for 10 min. The faster reacting ribose substrate
reduces the amount of interference observed throughout the
length of the RNA and provides a minimum estimate as to
the extent of analogue incorporation at each site. This
version of the "5'-end-labeled control" was used in the L+1
ScaI interference calculations for positions in the internal
guide sequence (IGS) and P2 helix (8).

m2GαS Interference Mapping of the Tetrahymena Ribozyme.
Interference mapping of m2GαS was performed
using 50 nM ribozyme and 10 nM oligonucleotide substrate
in a buffer containing 4 mM MgCl2, 50 mM HEPES, pH
7.0, at 50 °C for 10 min. The L-21 G414 ribozyme was
incubated with 3'-end-labeled dT(-1)S, and the L+1 ScaI
ribozyme was incubated with 5'-end-labeled dT(-1)P. The
ligation reactions were stopped by the addition of 1 volume
of urea loading buffer (8 M urea, 50 mM EDTA, 0.01%
bromophenol blue, 0.01% xylene cyanol) and worked up with
iodine as described above. The reaction containing no iodine
was run in parallel to confirm that the cleavage pattern was
specific to the iodine treatment and not due to nonspecific
degradation. The transcriptional efficiencies of m2GαS
incorporation were determined using the 3'-exon ligation
reaction of L-21 G414 RNAs with dT(-1)S. Relative
efficiencies were calculated by comparing the intensity of
the cleavage products throughout the length of the intron to
those of the GαS control (5% incorporation standard).

Interference Quantitation. Peak intensities for both the
parental nucleotide (GαS) and the nucleotide analogue
(m2GαS) were quantitated by PhosphorImager analysis at
each position for the 3'-exon ligation or 5'-exon ligation
experiments and the 5'-end-labeled control. The extent of
interference at each position was calculated by substituting
the band intensities at each nucleotide position into the
equation:

$$\text{Interference} = \frac{\text{G}\alpha\text{S ligation reaction} \,/\, \text{m}^2\text{G}\alpha\text{S ligation reaction}}{\text{G}\alpha\text{S labeled control} \,/\, \text{m}^2\text{G}\alpha\text{S labeled control}}$$

The resulting interference value normalizes for phosphorothioate
effects assumed to be equivalent for both GαS and
m2GαS. It also controls for variability in the extent of
analogue incorporation or reactivity with iodine at each
position. All the interference values were further normalized
to account for differences in loading and extent of reaction
between lanes by calculating the average interference value
at all positions in the RNA that were within two standard
deviations from the mean and dividing each individual
interference value by the normalized average (the averages
ranged from 0.8 to 1.2). This resulted in an interference k
value for each position in both the 3'-exon and 5'-exon
ligation reactions. A k value of 1 indicates that there is no
effect of substituting the analogue at that site, a value greater
than 1 indicates inhibition of activity, and a value less than
1 indicates that activity is enhanced by analogue substitution
at that site.
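
The calculation described above can be sketched in a few lines. This is an illustrative reading of the text rather than the authors' software: the function names are hypothetical, the sample standard deviation is an assumed choice (the text does not say which estimator was used), and any intensities fed to it in an example would be invented.

```python
from statistics import mean, stdev


def raw_interference(g_lig, m_lig, g_ctrl, m_ctrl):
    """Ratio of ratios for one position.

    (parental ligation / analogue ligation) divided by
    (parental control / analogue control), using band intensities.
    """
    return (g_lig / m_lig) / (g_ctrl / m_ctrl)


def normalize_interference(raw_by_site):
    """Scale raw values so that unaffected positions average 1.

    The scale factor is the mean of all raw values lying within two
    standard deviations of the overall mean, as described in the text.
    Sample standard deviation is an assumption of this sketch.
    """
    values = list(raw_by_site.values())
    mu, sd = mean(values), stdev(values)
    inliers = [v for v in values if abs(v - mu) <= 2 * sd]
    scale = mean(inliers)
    return {site: v / scale for site, v in raw_by_site.items()}
```

Excluding positions beyond two standard deviations keeps a few strongly interfering sites from inflating the scale factor, so the bulk of neutral positions end up with values near 1.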

RESULTS 

Transcriptional Incorporation of m2GαS. The α-phosphorothioate-tagged
triphosphate of m2G was synthesized for
use in NAIM. To be useful in this method, it must be
efficiently and accurately incorporated into an RNA by in
vitro transcription. Initial efforts to incorporate m2GαS were
unsuccessful using T7 RNA polymerase, probably because
the methyl group occupies a prominent position in the minor
groove that is likely to be involved in the error-reading
mechanism of the polymerase (29). No incorporation was
observed even at high ratios of m2GTPαS to GTP (data not
shown). We had previously obtained and overexpressed a
Y639F point mutant of T7 RNA polymerase to perform
NAIM with 2'-deoxynucleotides. This mutation was reported
to cause reduced selectivity for the 2'-position of the
nucleotide during transcription (26). The mutant polymerase
efficiently incorporates 2'-deoxy, 2'-methoxy, 2'-fluoro, and
2'-thio nucleotide triphosphates into RNA transcripts (9, 30-32).
It seemed reasonable to expect that this polymerase
might also have enhanced tolerance for an additional methyl
group in the helical minor groove.

We repeated the transcriptions using the Y639F polymerase
and found that m2GαS was efficiently incorporated into
the L-21 G414 RNA. An incorporation level of approximately
5% was achieved using a ratio of 1.5 mM
m2GTPαS to 0.5 mM GTP. Based upon iodine treatment
of the 5'-end-labeled L-21 G414 transcript, m2GαS was
incorporated at every G position within the RNA, and no
significant incorporation was detected at any non-G sites
within the sequence (Figure 4A, lanes 2 and 3). Furthermore,
the efficiency of incorporation at each G was generally
equivalent to that seen for the GαS standard.

m2GαS Interference Mapping. The sites of m2GαS
incorporation that interfere with intron activity were mapped
using the L-21 G414 ribozyme (8). pUCL-21G414 encodes
a form of the group I intron that includes the terminal G414,
but lacks the first 21 nucleotides of the intron. In the
presence of an oligonucleotide substrate analogous to the
5'-3' ligated exon product, dT(-1)S, this intron can perform
the reverse of the second step of splicing by using the 3'-OH
of the terminal G414 as the nucleophile to attack the
splice site between the exons (Figure 2A) (15, 16). This
reaction transfers the 3'-exon onto the 3'-end of the intron,
which radiolabels the active molecules in the population if
the oligonucleotide substrate is 3'-end-labeled. Following
the ligation reaction, the RNAs are digested with iodine and
the cleavage products resolved by PAGE (Figure 4A). The
intensities of individual bands in the m2GαS and GαS
cleavage ladders were compared to the 5'-end-labeled
controls to identify the sites of interference. Although the
majority of the 107 Gs within the ribozyme were informative,
complete data could not be obtained for 10 sites due to the
inability to separate the cleavage products at the positions
furthest from the location of the radiolabel (G22-G32 for
the 3'-exon ligation reaction, and G405-G414 for the
5'-end-labeled control). As might be expected, most positions
did not show any effect upon analogue substitution. Greater
than 95% of the interference k values were between 0.67
and 1.5.
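
The bounds quoted above are reciprocals of one another (1/1.5 is approximately 0.67), so they form a symmetric window around 1 on a ratio scale. The sketch below uses that window to sort k values into three groups. Treating 1.5 as a decision threshold is an assumption made for illustration; the text reports the range descriptively and does not state a cutoff for calling interference, and the function name is hypothetical.

```python
def classify_k(k, upper=1.5):
    """Label a normalized interference value relative to a symmetric window.

    The lower bound is the reciprocal of the upper bound, so a 1.5-fold
    loss and a 1.5-fold gain of activity are treated alike.
    """
    lower = 1.0 / upper
    if k > upper:
        return "interference"
    if k < lower:
        return "enhancement"
    return "no effect"


# Hypothetical values for three positions (not measured data).
labels = [classify_k(k) for k in (1.0, 6.0, 0.5)]
```

A window defined this way is easiest to interpret when the values are viewed on a logarithmic scale, where the two bounds sit at equal distances from 1.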

Two sites, G212 and G303, showed complete interference
with m2GαS in the 3'-exon ligation reaction (Figure 4A, lanes
8 and 9; Figure 5). Both of these positions are completely
conserved among the 131 examples of IC1 and IC2 introns,
of which the Tetrahymena intron is a member (33). The
only other Gs that are conserved to this level among IC1

Figure 4: Analogue incorporation and interference reactions for
3'- and 5'-exon ligation. (A) Interference mapping for the 3'-exon
ligation reaction. The phosphorothioate-tagged analogue incorporated
into each RNA is listed above the lane numbers. The
nucleotide number corresponding to several of the bands is marked
to the left of each gel. The addition (lanes 1-3, 7-9) or omission
(lanes 4-6, 10-12) of iodine is indicated. 5'-End-Labeled
Control: The L-21 G414 5'-end-labeled control showing the extent
and positions of analogue incorporation throughout the intron.
Cleavage products at positions G212 and G303 (asterisks) show
m2GαS was incorporated at these sites (lane 3). This particular gel
was electrophoresed at 75 W for 2.5 h. Longer electrophoretic times
were used to improve the signal resolution of the nucleotides toward
the 3'-end of the RNA (not shown). 3'-Exon Ligation Reaction:
The 3'-exon ligation reaction of L-21 G414 RNA with dT(-1)S.
This autoradiogram reveals the sites of analogue interference
throughout the intron. The m2GαS cleavage products at G303 and
G212 (asterisks) are of substantially lower intensity than in the
5'-end-labeled control (lane 9). This particular gel was electrophoresed
at 75 W for 1.75 h. Longer electrophoretic times were used to
resolve the cleavage products toward the 5'-end of the intron (not
shown). (B) The 5'-exon ligation reaction of L+1 ScaI RNA with
rT(-1)P or dT(-1)P in 10 or 4 mM MgCl2, respectively. This
autoradiogram shows the interference results for nucleotide positions
within the IGS. The asterisk is to call attention to the m2GαS
interference seen at G22. m2GαS interference was also observed
at G212 (not shown). Interference was observed with IαS at all
the positions within the IGS (compare lanes 3 and 6). The no-iodine
control lanes are omitted, but they were equivalent to those
shown in panel A.



Biochemistry, Vol. 37, No. 37, 1998 12937 

and IC2 introns are G22 and G264. G22 was not informative
in this assay. G264 is known to be essential to intron
function because it forms part of the G binding site, but it
does this using functional groups in the major groove of the
P7 helix that would not be affected by m2GαS substitution
(34). No other sites showed detrimental effects due to
m2GαS substitution, including G111 and G112 where modest
interference was previously observed with IαS substitution
(8). G303 also showed interference with IαS, though there
was no IαS interference at G212.

One region of particular interest within the ribozyme is 
the internal guide sequence (IGS) strand of the P1 helix
(Figure 1). Unfortunately, the five Gs within the IGS were
uninformative because the cleavage products could not be
resolved in the 3'-exon ligation reaction. To gain information
about these nucleotides, we used the L+1 ScaI ribozyme,
which lacks the last 3 nucleotides at the 3'-end of the intron
(including G414), but includes the first 21 nucleotides at the
5'-end (8). This form of the intron includes a G as the first
base of the intron that is equivalent to the exogenous G added
to the 5'-end of the intron after the first step of splicing (35).
In the presence of a 5'-exon oligonucleotide substrate, this
intron performs a reaction that is analogous to the reverse
of the first step of splicing, wherein it transfers the 5'-exon
onto the 5'-end of the intron with concomitant release of
the terminal G (13, 14) (Figure 2B). This reaction places
the radiolabel at the 5'-end of the intron where the IGS
nucleotides can be readily resolved. Using this reaction, all
five of the Gs within the P1 helix showed interference with
IαS, though only G22 is conserved (8, 19).

In contrast to the results with IαS interference, G22 was 
the only position in the IGS that showed interference from 
m²GαS substitution (Figure 4B). No m²GαS interference 
was detected at G23, G25, G26, or G27. G22 is universally 
conserved among all known group I introns and is essential 
for defining the 5'-splice site (33, 36-38). In the 5'-exon 
ligation reaction, strong interference was also detected at 
G212, though there was no interference at G303 (data not 
shown). Thus, using two G analogues that alter the exocyclic 
amine, there are sites where interference was observed with 
both analogues (G22 and G303), as well as sites where only 
one of the analogues interfered with activity: G212 only 
showed interference with m²GαS, while G23, G25, G26, 
G27, G111, and G112 only showed interference with IαS. 

Mutagenesis of the Closing Base Pair in the P4 Helix. 
Given the reduced stabilities of duplexes containing inosine 
compared to m²G (39-41), the lack of m²GαS interference 
at some of the IαS interference sites suggests that reduced 
duplex stability is the likely cause of IαS interference at these 
positions. This interpretation implies that the stability of the 
closing G-C base pair in the P4 helix might be important 
for 3'-exon ligation activity. To test this possibility directly 
in the L-21 G414 ribozyme, we mutated the G112-C208 
pair to an A112-U208 pair and measured the rates of 
3'-exon ligation with the dT(-1)S substrate under the same 
conditions as those used for the interference experiments (16). 
kcat for 3'-exon ligation was unaffected by the base pair 
mutation (0.015 ± 0.002 and 0.014 ± 0.001 min⁻¹ for 
G112-C208 and A112-U208, respectively), but Km 
increased by approximately 2.5-fold (54 and 140 nM, 
respectively). 
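
The comparison of the two base pairs can be made explicit with a short calculation. The following is only a minimal sketch and not part of the original analysis: it assumes that the first-order rate constants reported in min⁻¹ are kcat values, it uses the central values without their uncertainties, and the class and variable names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KineticParams:
    """Steady-state parameters as quoted in the text (names are illustrative)."""

    kcat_per_min: float  # assumed to be kcat, in min^-1
    km_nM: float         # Km, in nM

    def kcat_over_km(self) -> float:
        """Apparent second-order rate constant in M^-1 min^-1."""
        return self.kcat_per_min / (self.km_nM * 1e-9)


# Central values from the text; the reported uncertainties are ignored here.
g112_c208 = KineticParams(kcat_per_min=0.015, km_nM=54.0)
a112_u208 = KineticParams(kcat_per_min=0.014, km_nM=140.0)

km_fold = a112_u208.km_nM / g112_c208.km_nM
kcat_ratio = a112_u208.kcat_per_min / g112_c208.kcat_per_min
efficiency_fold = g112_c208.kcat_over_km() / a112_u208.kcat_over_km()

print(f"Km fold increase (mutant / wild type): {km_fold:.2f}")
print(f"kcat ratio (mutant / wild type): {kcat_ratio:.2f}")
print(f"kcat/Km fold decrease (wild type / mutant): {efficiency_fold:.2f}")
```

With these central values the Km ratio is about 2.6, in line with the approximately 2.5-fold increase quoted above, while kcat is essentially unchanged.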
average of at least two, and as many as eight, independent experimental measurements. 



DISCUSSION 

Nucleotide analogue interference mapping (NAIM) is a 
general biochemical method to identify the chemical groups 
that are important for RNA function (Figure 2C). In a 
previous report, we used IαS to identify the G positions in 
the Tetrahymena intron where the exocyclic amine is 
important for 3'-exon and 5'-exon ligation (5). These 
included all five of the Gs within the internal guide sequence 
(IGS), the closing two G-C base pairs at the top of the P4 
helix, and G303, which is located within the J8/7 
single-stranded segment that threads through the center of the 
catalytic core (Figure 5). Based upon sequence conservation 
and mutagenesis experiments, not all the sites of IαS 
interference are expected to be involved in tertiary contacts 
within the ribozyme fold (33, 42). 

Inosine substitution, which deletes the N2 exocyclic amine 
of G, can affect at least two aspects of RNA folding: duplex 
stability and tertiary structure formation. Thermodynamic 
characterization of model RNA duplexes containing inosine 
substitutions shows that an I-C pair is between 1 and 2 
kcal·mol⁻¹ less stable than a G-C pair (39, 40). Yet, while 
inosine substitution can destabilize RNA duplexes, it can also 
have dramatic effects on ribozyme activity if the N2 amine 
participates in tertiary hydrogen bonding, such as seen at 
position G22 within the Tetrahymena intron (43). Because 
inosine affects both secondary and tertiary stability, it is not 
possible with IαS interference mapping alone to distinguish 
between these two contributions to RNA structure. 

N²-Methylguanosine (m²G) is a second analogue that 
modifies the exocyclic amine of G, but instead of deleting 
the functional group, it substitutes one proton of the amine 
with a methyl group (Figure 3). Duplex studies with m²G 
have shown that it can form base pairs with C, U, and A 
that are at least as stable as those formed by G, which 
indicates that the methyl group is equally stable on either 
the s-trans or the s-cis face of the base (41, 44). We reasoned 
that this property might make it possible to use the 
phosphorothioate of m²G to probe specifically for essential minor 
groove tertiary contacts in RNA without encountering the 
secondary structural effects that are observed with IαS. 

Toward this objective, we synthesized the phosphorothioate 
of N²-methylguanosine (m²GαS) and randomly incorporated 
it into two reverse splicing forms of the Tetrahymena 
group I intron using a mutant form of the T7 RNA 
polymerase. We observed three sites of m²GαS interference 
within the Tetrahymena intron: G22, G212, and G303 (Figure 
5). These sites are located in the P1 helix, P4 helix, and 
J8/7 single-stranded region, respectively (Figure 1). All three 
of the Gs are completely conserved among the IC1-IC2 
introns (which include the Tetrahymena intron) (55), and 
all three Gs are known to participate in minor groove tertiary 
hydrogen bonding between helical elements of the ribozyme 
structure (2, 3, 45) (Figure 6). 

Interference of P1 Helix Docking into J4/5. G22 forms a 
wobble pair with U-1 at the 5'-splice site of the intron (46), 
and these two bases are the only conserved residues within 
the P1 helix (Figure 1) (19, 42). While U-1 is essential for 
holding the G in a wobble configuration, U-1 does not 
participate directly in ground-state tertiary interactions with 
the intron core (43). Instead, the G utilizes its amine and 
its 2'-OH to dock into a wobble receptor located within the 
J4/5 region (3, 47, 48). Within a G·U wobble pair, neither 
of the amino protons participates in duplex formation; 
however, both protons of the G22 amino group participate 
in tertiary hydrogen bonds with the two consecutively stacked 
sheared A-A pairs that constitute the wobble receptor (5). 
One proton donates a hydrogen bond to the N3 of A207, 
while the second proton donates a hydrogen bond to the 
2'-OH of A207 (Figure 6A). Interference with m²GαS and IαS 
is in full agreement with this model for P1 helix docking, 
because placement of the m²GαS methyl group onto either 
the s-cis or the s-trans face of the nucleotide would disrupt 
a hydrogen bond essential for docking the 5'-exon into the 
active site. 

Interference of the P4 Interaction with the A-Rich Bulge. 
G212 base-pairs with C109 in the P4 helix. A G-C pair 
utilizes one of the exocyclic amino protons for duplex 
formation, but leaves the second proton unpaired in the minor 
groove. The crystal structure of the P4-P6 domain 
demonstrated that G212, and nucleotides surrounding it, makes 
extensive minor groove tertiary interactions with the A-rich 
bulge of the P5abc extension (Figure 1) (2, 49). These 
contacts are largely sequence-nonspecific hydrogen bonds 
to 2'-OHs on the P4 helix. The single exception is a tertiary 
hydrogen bond between the N2 amine of G212 and the N3 
of A184 (Figure 6B). Strong interference at G212 with 
m²GαS suggests that the interaction between P4 and the 
A-rich bulge is important for activity, though the absence 
of IαS interference indicates that deletion of the hydrogen 
bond is not sufficient to affect intron splicing. In this closely 
packed region of the RNA, the additional steric bulk of the 
N2-methyl group in the P4 minor groove is likely to be more 
destabilizing than deletion of the amine. 

Figure 6: Tertiary hydrogen bonding interactions postulated for 
the three Gs (bold) that show m²GαS interference. In each case, 
the amino group that shows interference is highlighted with a box. 
(A) G22 forms a wobble pair with U-1 and makes a tertiary 
interaction with A207 (5). (B) G212 forms a Watson-Crick pair 
with C109 and makes a tertiary interaction with A184 (2). (C) G303 
is in a single-stranded region of the ribozyme, but makes a tertiary 
interaction with C-2 (45). 

m²GαS interference at G212 is the first example of 
significant loss of intron splicing activity resulting from a 
single point mutation or functional group change in the 
interface between P5abc and the P4-P6 helical stack (9, 50). 
Apparently, there is sufficient energetic redundancy to 
overcome most minor structural alterations in the context of 
the intact intron (50). The P4-P6 crystal structure suggests 
three other sites within the domain that are strong candidates 
for m²GαS interference (2). The N2 amino groups of G150 
and G250 participate in the tetraloop-tetraloop receptor 
interaction between P5b and P6a, and the amine of G181 
helps form the substructure of the A-rich bulge; however, 
interference was not observed at any of these three sites in 
either ligation assay (Figure 5). m²GαS interference at G212, 
but not at G150 or G250, suggests that the A-rich bulge 
interaction with P4 may be more important for intron function 
than the tetraloop-tetraloop receptor interaction. 

Interference from P1 Helix Docking into J8/7. Unlike the 
other two sites that demonstrated m²GαS interference, G303 
is in a single-stranded segment of the intron, J8/7 (Figure 
1), which forms an extended triple helix complex with the 
minor groove of the P1 helix (45). Interference suppression 
experiments using a substrate with a 2'-deoxy substitution 
at C-2 demonstrated that G303 forms a tertiary hydrogen 
bond between its exocyclic amine and the 2'-OH of C-2 in 
the P1 helix (Figures 1 and 6C) (45). However, the 
interaction proposed to form between these two residues only 
utilizes one of the G303 amino protons. Presumably m²GαS 
could adopt the s-cis rotamer to facilitate hydrogen bonding 
by the s-trans proton of the amine. The fact that interference 
is observed with both IαS and m²GαS suggests either that 
the s-cis proton is involved in an additional hydrogen bond 
or that there is insufficient space in the P1-J8/7 packing 
interface to accommodate an s-cis methyl group. 

It is somewhat curious that IαS and m²GαS interference 
is observed at G303 for the 3'-exon but not the 5'-exon 
ligation reaction. While this could reflect an important 
conformational difference between the first and second steps 
of splicing, the more likely explanation is that there is 
additional energetic redundancy for P1 helix docking within 
the 5'-exon ligation reaction. In the L+1 ScaI construct, 
the reactive G is covalently connected to the P1 helix. 
Additional stabilization energy for P1 docking is provided 
by G binding into the P7 helix, which would make the G303 
interaction with C-2 less critical. In the L-21 G414 
ribozyme, the reactive G is at the 3'-end of the RNA where it 
does not contribute directly to P1 helix binding. In this 
arrangement, the interaction between G303 and C-2 is 
necessary for alignment of the substrate into the active site. 

Interference with IαS but Not m²GαS Suggests Duplex 
Stability Is Important. There are six sites that showed 
interference with IαS that were not sensitive to m²GαS 
substitution. All the sites are in double-stranded regions of 
the intron, and they cluster into two helices. Four of the 
sites (G23, G25, G26, and G27) are in the P1 helix, and two 
of the sites (G111 and G112) are at the top of the P4 helix 
just below the sheared A-A pairs in J4/5. 

The only phylogenetically conserved sequence in the P1 
helix is the G22-U-1 base pair at the cleavage site. The 
remaining positions maintain complementarity between the 
5'-exon and the IGS, but the primary sequence is not 
conserved and can be mutated as long as base-pairing is 
retained (42). This lack of sequence conservation suggests 
that base-specific functional groups do not participate directly 
in tertiary interactions with the ribozyme catalytic core, which 
implies that IαS interference at these sites results from 
reduced secondary structural stability rather than the loss of 
a tertiary hydrogen bond to an N2-exocyclic amine. In a 
competitive environment where multiple ribozymes (50 nM) 
are vying for a slow reacting substrate (kcat = 0.015 min⁻¹, 
koff > 0.1 min⁻¹) that is present in limiting quantity (10 nM, 
0.2 equiv), the inability of some ribozymes to form a stable 
duplex between the IGS and the substrate confers a 
significant disadvantage compared to other variants in the 
population. In contrast, interference was not observed with m²GαS 
at the nonconserved positions in the P1 helix, because this 
analogue does not affect duplex stability (41). Therefore, 
interference from IαS, but not from m²GαS, defines a 
biochemical signature to identify positions where secondary 
structural stability is critical for RNA function. 
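
The signature logic described here can be written down as a small lookup. The sketch below is illustrative only: the site assignments are those reported in the text (G303 refers to the 3'-exon ligation reaction), while the function, enum, and dictionary names are hypothetical, and the one-line interpretations are a simplification of the discussion rather than a general rule.

```python
from enum import Enum


class Signature(Enum):
    BOTH = "IaS and m2GaS"
    M2G_ONLY = "m2GaS only"
    INOSINE_ONLY = "IaS only"
    NEITHER = "no interference"


# Simplified reading of each signature, following the discussion in the text.
INTERPRETATION = {
    Signature.BOTH: "candidate tertiary hydrogen bond via the N2 amine",
    Signature.M2G_ONLY: "candidate tertiary contact; methyl group not tolerated",
    Signature.INOSINE_ONLY: "duplex stability likely important",
    Signature.NEITHER: "N2 amine not implicated by either analogue",
}


def classify(ias: bool, m2gas: bool) -> Signature:
    """Map a pair of interference observations to a signature."""
    if ias and m2gas:
        return Signature.BOTH
    if m2gas:
        return Signature.M2G_ONLY
    if ias:
        return Signature.INOSINE_ONLY
    return Signature.NEITHER


# (IaS interference, m2GaS interference) as reported for each position.
OBSERVED = {
    "G22": (True, True),
    "G303": (True, True),   # 3'-exon ligation reaction
    "G212": (False, True),
    "G23": (True, False),
    "G25": (True, False),
    "G26": (True, False),
    "G27": (True, False),
    "G111": (True, False),
    "G112": (True, False),
}


def sites_with(signature: Signature) -> list[str]:
    """Return positions with the given signature, ordered by residue number."""
    hits = [s for s, obs in OBSERVED.items() if classify(*obs) is signature]
    return sorted(hits, key=lambda site: int(site[1:]))


for sig in Signature:
    print(f"{sig.value}: {sites_with(sig)} -> {INTERPRETATION[sig]}")
```

Keeping the observations separate from the interpretation table makes it easy to substitute data from another RNA without changing the classification step.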

This pattern of interference is also seen within the P4 helix, 
where nucleotides G111 and G112 showed weak interference 
with IαS. This is a challenging region of the sequence to 
analyze experimentally by NAIM because it is difficult to 
resolve the cleavage products for the 3 consecutive Gs 
(G110-G112) that are located more than 300 nucleotides 
from the radiolabel at the 3'-terminus. Furthermore, 
interference at these sites appears to be highly dependent upon the 
reaction conditions. For example, IαS interference at G111 
and G112 was not observed under slightly modified divalent 
metal conditions (such as 3 mM Mg²⁺ and 1 mM Mn²⁺; M. 
Kronman and S. A. Strobel, unpublished results), nor was it 
observed in the 5'-exon ligation reaction (8). Nevertheless, 
the interference data with IαS and m²GαS at 4 mM Mg²⁺ 
imply that ribozymes with a stable G112-C208 closing base 
pair in the P4 helix have a modest selective advantage over 
those with an IαS112-C208 pair. 

To directly test the importance of the P4 closing base pair, 
we mutated the G112-C208 pair to A-U in the context of 
the L-21 G414 ribozyme and measured the second-order rate 
constant for 3'-exon ligation under the same conditions as 
those used for the interference experiments (16). kcat for the 
mutant enzyme was the same as for the wild type, but the Km 
was 2.5-fold higher. The magnitude of the IαS interference 
suggested that the effect would be larger, but it is possible 
that the A-U mutation is not as destabilizing as the IαS-C 
pair that occurs in the interference experiment. 
Thermodynamic measurements of model duplexes have shown that an 
I-C pair is in fact slightly less stable (0.3 kcal·mol⁻¹) than 
an A-U pair (39). 
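
To put these numbers on a common scale, a free-energy difference can be converted to a fold change in an equilibrium constant through exp(ΔΔG/RT), and vice versa. The sketch below is a rough illustration under stated assumptions: the temperature of 37 °C is assumed rather than taken from the text, the 2.5-fold change in Km is treated as if it were a change in a binding equilibrium (which need not hold), and the function names are hypothetical.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal mol^-1 K^-1
TEMP_K = 310.15                 # assumed temperature (37 C); not stated in the text


def fold_change(ddg_kcal_per_mol: float, temp_k: float = TEMP_K) -> float:
    """Fold change in an equilibrium constant for a given free-energy difference."""
    return math.exp(ddg_kcal_per_mol / (R_KCAL_PER_MOL_K * temp_k))


def ddg_from_fold(fold: float, temp_k: float = TEMP_K) -> float:
    """Free-energy difference (kcal/mol) corresponding to a given fold change."""
    return R_KCAL_PER_MOL_K * temp_k * math.log(fold)


# Values quoted in the text.
km_fold_observed = 2.5          # A112-U208 versus G112-C208
ic_vs_au_ddg = 0.3              # I-C less stable than A-U, kcal/mol
ic_vs_gc_ddg = (1.0, 2.0)       # I-C less stable than G-C, kcal/mol

print(f"2.5-fold ~ {ddg_from_fold(km_fold_observed):.2f} kcal/mol")
print(f"0.3 kcal/mol ~ {fold_change(ic_vs_au_ddg):.2f}-fold")
for ddg in ic_vs_gc_ddg:
    print(f"{ddg:.1f} kcal/mol ~ {fold_change(ddg):.1f}-fold")
```

Under these assumptions, a 2.5-fold effect corresponds to roughly 0.6 kcal·mol⁻¹, and the additional 0.3 kcal·mol⁻¹ of an I-C pair relative to A-U would add a further factor of about 1.6.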

The G112-C208 pair is immediately below two consecutive 
sheared A-A pairs that act as a wobble receptor for the 
P1 helix (Figure 1) (3). Thermodynamic measurements of 
model duplexes containing consecutive A-A or G-A 
mismatches have demonstrated that the mispairs show a net 
stabilization from a G-C closing pair, but a net 
destabilization from an A-U closing pair (51, 52). One dramatic 
consequence of this destabilization is that mutation of the 
closing base pair converts the G·A mispair from a sheared 
conformation to an imino hydrogen-bonded conformation (53, 
54). While the A-A pair cannot undergo this secondary 
structural transition, there is a precedent to argue that the 
stability of the P4 closing pair is important to stabilize the 
consecutively stacked sheared A-A pairs located immediately 
above it. 
m²GαS Interference Only at Sites of Direct Tertiary 
Hydrogen Bonding. Immediately adjacent to the sites of 
m²GαS interference at G22 and G212 within the P1 and P4 
helices, respectively, there are G-C pairs that do not show 
interference (Figures 1 and 5). These Gs participate in 
tertiary hydrogen bonds, but utilize functional groups other 
than their N2 amines. For example, the 2'-OHs of the 
G23-C-2 pair are each thought to participate in a direct hydrogen 
bond: the 2'-OH of G23 bonds with the 2'-OH of C208 (3), 
and the 2'-OH of C-2 bonds with the N2 amine of G303 
(45). Despite tremendous close packing within the minor 
groove face of this region of the molecule, interference was 
not observed with m²GαS at G23. A similar pattern was 
observed in the P4 helix, where the 2'-OHs of G110, G212, 
and C109 all participate in hydrogen bonds to groups within 
the A-rich bulge. Nevertheless, m²GαS interference was not 
observed in the G110-C211 base pair above the 
C109-G212 pair, nor was it observed in the G108-C213 base pair 
below G212. Thus, within the Tetrahymena ribozyme, 
m²GαS interference is only observed at Gs that participate 
directly in tertiary hydrogen bonds via their exocyclic amines, 
not just at Gs that are closely packed within the tertiary 
structure. 

All three of the sites of m²GαS interference within the 
Tetrahymena intron are highly conserved and participate in 
tertiary hydrogen bonding interactions via their N2 amino 
groups. The sites of interference within this intron occurred 
at a G-C pair, a G-U pair, and an unpaired G. Given that 
the wide and shallow RNA minor groove is likely to be a 
common interface for RNA helix packing, and that there are 
relatively few reagents that probe this face of the helix, 
m²GαS provides a valuable tool for RNA structure/function 
mapping in less well-characterized RNAs. This is 
particularly true if m²GαS is used in combination with IαS, because 
together these analogues make it possible to differentiate sites 
of essential secondary structural stability from sites of tertiary 
hydrogen bonding. 